Real Time News
Python3, Apache Spark, MongoDB, Druid, ReactJS
The product gives a near real-time overview of trending terms across digital news sources using NLP (Natural Language Processing). It lets analysts quickly identify trends and outline news peaks through a visualization tool, providing essential information that was not available before the product launch.
The backend consumes data from digital news portals and processes keywords and phrases using a highly scalable component powered by NLP models. Processed data is stored in high-performance indexed databases, while raw data is stored in a data lake. The time series of processed data is plotted on a filterable graph that lets the end user navigate through time and topics with ease.
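The step from processed keywords to a plottable time series can be illustrated with a minimal sketch. It uses only the Python standard library and is not the product's actual pipeline: the stopword filter stands in for the NLP models, and the names `extract_terms`, `hourly_term_counts`, and `STOPWORDS` are hypothetical.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical stopword list; a real deployment would rely on the NLP models.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "for", "is"}


def extract_terms(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def hourly_term_counts(articles):
    """Group term counts by the hour in which each article was published.

    `articles` is an iterable of (published: datetime, text: str) pairs.
    Returns a dict mapping the start of each hour to a Counter of terms.
    """
    series = defaultdict(Counter)
    for published, text in articles:
        bucket = published.replace(minute=0, second=0, microsecond=0)
        series[bucket].update(extract_terms(text))
    return series


articles = [
    (datetime(2024, 1, 1, 10, 5), "Election results in the capital"),
    (datetime(2024, 1, 1, 10, 40), "Election turnout rises"),
    (datetime(2024, 1, 1, 11, 15), "Storm hits the coast"),
]
series = hourly_term_counts(articles)
top_term = series[datetime(2024, 1, 1, 10)].most_common(1)[0]
```

Each hourly bucket holds the term counts a graph could plot, and filtering by topic amounts to selecting one term across buckets.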
The product lets the end user add new sources, define in-application mappings, and customize models. Supporting different consumption methods matters because news portals are reached through various connectors, such as RSS, web scraping, REST API consumption, and similar.
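One way to picture pluggable connectors is a registry that maps a source type to a parser producing a common record shape. The sketch below is an assumption for illustration only: the names `parse_rss`, `parse_rest_json`, `CONNECTORS`, and `ingest` are hypothetical, as are the `headline` and `url` fields of the REST payload. It parses payloads already fetched as strings, so it makes no network calls.

```python
import json
import xml.etree.ElementTree as ET


def parse_rss(payload):
    """Turn an RSS document into records with 'title' and 'link' keys."""
    root = ET.fromstring(payload)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        }
        for item in root.iter("item")
    ]


def parse_rest_json(payload):
    """Map a JSON array onto the same record shape.

    Assumes each object carries 'headline' and 'url' fields (hypothetical).
    """
    return [
        {"title": entry.get("headline", ""), "link": entry.get("url", "")}
        for entry in json.loads(payload)
    ]


# Adding a new source type means registering one more parser here.
CONNECTORS = {"rss": parse_rss, "rest": parse_rest_json}


def ingest(source_type, payload):
    """Dispatch a raw payload to the connector registered for its type."""
    try:
        parser = CONNECTORS[source_type]
    except KeyError:
        raise ValueError(f"no connector registered for {source_type!r}") from None
    return parser(payload)


rss_payload = (
    "<rss><channel><item><title>A</title>"
    "<link>http://example.com/a</link></item></channel></rss>"
)
rest_payload = '[{"headline": "B", "url": "http://example.com/b"}]'
rss_records = ingest("rss", rss_payload)
rest_records = ingest("rest", rest_payload)
```

Because every connector returns the same record shape, downstream processing does not need to know which kind of source a record came from.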