ShEx java OutOfMemoryError: Java heap space #12

@ElwinHuaman

Dear all,

I am testing ShEx Java with a large-scale N-Quads dataset and have run into a problem.

Context:
I am using an Intel Core i7-8550U CPU at 1.80 GHz (4 cores) with 16 GB of RAM, Windows 10 64-bit, Java 9.0.4, and Eclipse IDE 4.10. My run configuration sets the VM argument -Xmx12G.
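
To rule out the possibility that the -Xmx12G setting is not reaching the JVM that Eclipse launches, a small check like the following could be run with the same run configuration. This is only a minimal sketch using the standard `Runtime` API; the class name `HeapCheck` is made up for illustration and is not part of ShEx Java.

```java
public class HeapCheck {
    // Returns the maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // With -Xmx12G applied, the reported value should be close to 12288 MiB.
        System.out.println("Max heap (MiB): " + maxHeapMiB());
    }
}
```

If the reported value is far below the configured limit, the VM argument is probably set on a different run configuration than the one being launched.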

ShExMain.java
```java
...
Model data = Rio.parse(new FileInputStream(dataFile.toFile()), baseIRI, RDFFormat.NQUADS);
Graph dataGraph = factory.asGraph(data);
...

String shMap = "{FOCUS a <http://schema.org/CreativeWork>}@<http://example.org/NameShape>";
...
try {
    BaseShapeMap shapeMap = parser.parse(new ByteArrayInputStream(shMap.getBytes()));
    RecursiveValidationWithMemorization algo = new RecursiveValidationWithMemorization(schema, dataGraph);
    ResultShapeMap result = algo.validate(shapeMap);
} catch (Exception e) {
    e.printStackTrace();
}
```
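
Because `Rio.parse` returns a `Model` that holds every statement in memory, the heap has to fit the whole dataset before validation even starts. One possible way to gauge how many focus nodes the shape map would select, without loading the file, is to stream it line by line. The following is a rough sketch under stated assumptions, not a ShEx Java feature: `FocusNodeScan` and `subjectsOfType` are hypothetical names, and the code assumes one statement per line with whitespace-separated terms rather than using a full N-Quads parser.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.LinkedHashSet;
import java.util.Set;

public class FocusNodeScan {
    static final String RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>";

    // Streams N-Quads text and collects the subjects typed with typeIri.
    // Only the matching subjects are kept in memory, never the statements.
    static Set<String> subjectsOfType(Reader in, String typeIri) {
        String wanted = "<" + typeIri + ">";
        Set<String> subjects = new LinkedHashSet<>();
        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // subject, predicate, object, then the rest (graph label and final dot)
                String[] terms = line.trim().split("\\s+", 4);
                if (terms.length < 4) {
                    continue; // blank or incomplete line
                }
                if (terms[1].equals(RDF_TYPE) && terms[2].equals(wanted)) {
                    subjects.add(terms[0]);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return subjects;
    }
}
```

For a real file, the reader could be obtained with `java.nio.file.Files.newBufferedReader(dataFile)` and the type IRI would be `http://schema.org/CreativeWork`, matching the shape map above. Whether the selected nodes and their neighbourhoods can then be validated in smaller batches depends on the schema and is not something this sketch addresses.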

Issue:
The program fails with OutOfMemoryError: Java heap space.

Questions:
Is there a recommended (special) setup of ShEx for validating, e.g., 1 billion N-Quads?
How many shape maps does ShEx support?
What is the maximum file size or number of triples/N-Quads that ShEx supports?
Under what configuration does ShEx run best?
How can the ShEx approach be scaled to large knowledge bases (e.g., by validating constraints directly against SPARQL endpoints)?
Which implementation is better suited for this: PyShEx, ShEx scale, ShEx.js, ShEx Java, or something else?

Thank you so much for your time, and please forgive me if I have stated anything incorrectly.

Best regards,
Elwin
