Dear all,
I am testing ShExjava with a large-scale N-Quads dataset and have run into some issues.
Context:
I am using an Intel Core i7-8550U CPU at 1.80 GHz (4 cores) with 16 GB of RAM, Windows 10 64-bit, Java 9.0.4, and Eclipse IDE 4.10. My run configuration sets the VM argument -Xmx12G.
ShExMain.java
`...
Model data = Rio.parse(new FileInputStream(dataFile.toFile()), baseIRI, RDFFormat.NQUADS);
Graph dataGraph = factory.asGraph(data);
...
String shMap = "{FOCUS a <http://schema.org/CreativeWork>}@<http://example.org/NameShape>";
...
try {
BaseShapeMap shapeMap = parser.parse(new ByteArrayInputStream(shMap.getBytes()));
RecursiveValidationWithMemorization algo = new RecursiveValidationWithMemorization(schema, dataGraph);
ResultShapeMap result = algo.validate(shapeMap);
} catch (Exception e) { e.printStackTrace(); }`
Issue:
ShExjava throws java.lang.OutOfMemoryError: Java heap space
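To rule out a misapplied run configuration, the heap limit the JVM actually received can be checked with a minimal sketch like the one below. `HeapCheck` and its two helper methods are hypothetical names used only for illustration and are not part of ShExjava; the only library call is the standard `Runtime.getRuntime().maxMemory()`.

```java
public class HeapCheck {
    // Converts a byte count to GiB (1 GiB = 1024^3 bytes).
    static double bytesToGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    // Maximum heap the JVM will attempt to use, as set by -Xmx
    // (or the JVM default when no -Xmx is given).
    static double maxHeapGiB() {
        return bytesToGiB(Runtime.getRuntime().maxMemory());
    }

    public static void main(String[] args) {
        // Run with the same VM arguments as ShExMain to compare.
        System.out.printf("Max heap: %.2f GiB%n", maxHeapGiB());
    }
}
```

If -Xmx12G is applied, the reported value should be roughly 12 GiB; some JVM configurations may report slightly less.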
Questions:
Is there a recommended (special) setup of ShEx for validating, e.g., 1 billion N-Quads?
How many shape maps does ShEx support?
What is the maximum file size or number of triples/N-Quads that ShEx supports?
Under what configuration does ShEx run best?
How can the ShEx approach be scaled to large-scale knowledge bases (e.g., by validating constraints directly against SPARQL endpoints)?
Which implementation is better suited for this: PyShEx, ShEx scale, ShEx.js, ShExjava, or something else?
Thank you so much for your time, and please forgive me if I have stated anything incorrectly.
Best regards,
Elwin