Crawl the web on a large scale with StormCrawler and Elasticsearch

2022-02-11T19:00:00Z

19:00 — 19:07 (UTC)

StormCrawler is a popular and mature open-source web crawler. Written in Java, it is both lightweight and scalable thanks to its distribution layer, which is based on Apache Storm. Part of its appeal is that it is extensible, modular, and versatile. In this presentation, we will take a closer look at the Elasticsearch module of StormCrawler and see how various organizations use it in production, sometimes on a very large scale.

Lightning talk | Introductory and overview | Enterprise search | Stack
Julien Nioche
Director | DigitalPebble