This application scrapes a product’s Amazon reviews, extracts entities, and reveals sentiment.
This application uses the jsoup library to scrape and parse HTML from a URL and extract product reviews using CSS selectors.
This application communicates with the TextRazor API to extract entities.
This application communicates with the Google Cloud NL API to detect sentiment score and magnitude.
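To make the two sentiment values concrete: the Natural Language API reports a score between -1.0 (negative) and 1.0 (positive), and a non-negative magnitude that grows with the overall amount of emotional content. The sketch below shows one possible way to turn the pair into a label. The class name `SentimentLabel` and both thresholds are hypothetical and chosen for illustration only; they are not taken from this project.

```java
final class SentimentLabel {
    // Hypothetical cut-offs, chosen for illustration; tune them on real review data.
    static final double SCORE_THRESHOLD = 0.25;
    static final double MIXED_MAGNITUDE = 2.0;

    // Maps a (score, magnitude) pair to a coarse label.
    static String describe(double score, double magnitude) {
        if (score >= SCORE_THRESHOLD) {
            return "positive";
        }
        if (score <= -SCORE_THRESHOLD) {
            return "negative";
        }
        // A near-zero score with high magnitude suggests strong but opposing opinions.
        return magnitude >= MIXED_MAGNITUDE ? "mixed" : "neutral";
    }

    public static void main(String[] args) {
        System.out.println(describe(0.8, 3.0));
        System.out.println(describe(0.0, 4.0));
    }
}
```

A near-zero score alone does not distinguish a bland review from one that praises and criticizes in equal measure, which is why the magnitude is consulted in that case.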
After cloning the project, you must set up the following two environment variables:
- TEXT_RAZOR_API_KEY: ***
- GOOGLE_APPLICATION_CREDENTIALS: service-account-file.json
For more info, see https://www.textrazor.com/signup and https://cloud.google.com/docs/authentication/production.
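A missing variable usually surfaces only as an authentication error at request time, so it can help to check the environment up front. The following is a minimal sketch of such a check, assuming Java 11 or later. The class `EnvCheck` and its method are hypothetical helpers, not part of this project; only the two variable names come from the list above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class EnvCheck {
    // The two variable names listed in the setup instructions.
    static final List<String> REQUIRED =
        List.of("TEXT_RAZOR_API_KEY", "GOOGLE_APPLICATION_CREDENTIALS");

    // Returns the required variables that are unset or blank in the given environment.
    static List<String> missingVariables(Map<String, String> env) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.isBlank()) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> missing = missingVariables(System.getenv());
        if (!missing.isEmpty()) {
            System.err.println("Missing environment variables: " + missing);
            System.exit(1);
        }
        System.out.println("Environment looks complete.");
    }
}
```

Passing the environment in as a map, rather than calling `System.getenv()` inside the method, keeps the check easy to exercise in a unit test.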
You may also need to install Apache Maven (https://maven.apache.org/) on your system.
Compile:
    mvn clean compile
Package with the assembly plugin:
    mvn clean compile assembly:single
Run:
    mvn -q clean compile exec:java -Dexec.executable="service.Main"
Compile, test, and run the Checkstyle and SpotBugs checks:
    mvn clean compile test checkstyle:check spotbugs:check
To see bug details in the SpotBugs GUI:
    mvn spotbugs:gui
Or create an XML report and check it:
    mvn spotbugs:spotbugs
    mvn spotbugs:check
For more info, see https://spotbugs.readthedocs.io/en/latest/maven.html
Checkstyle configuration files are in the config/ directory. The Maven Checkstyle plugin is set to use the Google code style.
Generate a report in XML format:
    mvn checkstyle:check
Output files:
    target/checkstyle-checker.xml
    target/checkstyle-result.xml
Generate a report in HTML format:
    mvn checkstyle:checkstyle
Output file:
    target/site/checkstyle.html
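
The XML report in target/checkstyle-result.xml can also be read programmatically, for example to print a violation count in a build script. The sketch below assumes the usual Checkstyle report layout, in which each violation is an `<error>` element nested under a `<file>` element. The class `CheckstyleReport` is a hypothetical helper, not part of this project.

```java
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class CheckstyleReport {
    // Counts the <error> elements in the text of a Checkstyle XML report.
    static int countViolations(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("error").getLength();
        } catch (Exception e) {
            // Surface parse failures as an unchecked exception to keep callers simple.
            throw new IllegalArgumentException("Not a readable XML report", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Report path as listed above; run `mvn checkstyle:check` first.
        String xml = Files.readString(Path.of("target/checkstyle-result.xml"));
        System.out.println("Checkstyle violations: " + countViolations(xml));
    }
}
```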