@@ -12,7 +12,7 @@ These are basic usage examples for the library. When the library was created the
Stream a WARC file from a URL and print the payload (response body) to the console.

- ```
+ ``` java
final URL warcUrl = new URL(
        "https://commoncrawl.s3.amazonaws.com/crawl-data/CC-MAIN-2018-43/segments/1539583508988.18/warc/CC-MAIN-20181015080248-20181015101748-00000.warc.gz");
@@ -26,7 +26,7 @@ WarcRecordStreamFactory.streamOf(warcUrl)
Read WARC records from a file one by one using the WarcReader class.

- ```
+ ``` java
final WarcReader warcReader = new WarcReader(new FileInputStream(
        new File("C:\\warc-test\\CC-MAIN-20180716232549-20180717012549-00001.warc.gz")));
@@ -48,20 +48,49 @@ while (hasNext) {
}
```

+ ### Reactive extensions
+
+ If you want a Flux of WarcRecords, use the reactive module:
+
+ ``` java
+ final URL warcUrl = new URL(
+         "https://commoncrawl.s3.amazonaws.com/crawl-data/CC-MAIN-2018-43/segments/1539583508988.18/warc/CC-MAIN-20181015080248-20181015101748-00000.warc.gz");
+
+ WarcRecordFluxFactory.buildWarcRecordFlux(warcUrl)
+         .filter(WarcRecord::isResponse)
+         .map(entry -> ((ResponseContentBlock) entry.getWarcContentBlock()).getPayloadAsString())
+         ...
+ ```
+
### Installation

The library is available in Maven Central.

You can use it with Maven:
- ```
+ ``` xml
<dependency>
    <groupId>com.github.bottomless-archive-project</groupId>
    <artifactId>java-warc</artifactId>
-     <version>1.0.0</version>
+     <version>1.2.0</version>
+ </dependency>
+ ```
+
+ Or Gradle:
+ ``` groovy
+ implementation 'com.github.bottomless-archive-project:java-warc:1.2.0'
+ ```
+
+ If you want to use the reactive module, you can use it with Maven:
+
+ ``` xml
+ <dependency>
+     <groupId>com.github.bottomless-archive-project</groupId>
+     <artifactId>java-warc-reactive</artifactId>
+     <version>1.2.0</version>
</dependency>
```

Or Gradle:
+ ``` groovy
+ implementation 'com.github.bottomless-archive-project:java-warc-reactive:1.2.0'
```
- implementation 'com.github.bottomless-archive-project:java-warc:1.0.0'
- ```