
Request Timeout when running crossvalidation #14

Closed
alphaville opened this issue May 23, 2011 · 12 comments

Comments

@alphaville

Occasionally, a Request Timeout exception is thrown (see for example http://toxcreate2.in-silico.ch/task/4332 ). I have the feeling that this happens under heavy load on the server. I've checked the response times at opentox.ntua.gr:8080 and they remain low (see http://ambit.uni-plovdiv.bg/cgi-bin/smokeping.cgi?target=NTUA and http://opentox.ntua.gr:8080/monitoring).

@mguetlein
Owner

Pantelis, this timeout is thrown at your service. You should check your web-server error and access logs.

@alphaville
Author

I think this is of some interest: http://opentox.ntua.gr/index.php/blog/76-rdf-opentox-discussion?showall=1&limitstart= - Especially the last paragraph about RDF vs ARFF. It explains the timeout. Could you provide ARFF along with RDF for datasets?

@mguetlein
Owner

The timeout in this task (http://toxcreate2.in-silico.ch/task/4332) happens during a simple GET request to your service:

 rest_params: 
    :headers: 
      :accept: application/rdf+xml
      :subjectid: AQIC5wM2LY4SfcyXpalQEtoyjxZzHhZIMARV18Unjdb27k8=@AAJTSQACMDE=#
    :payload: 
    :rest_uri: http://opentox.ntua.gr:8080/model/9b84be8c-87d6-4405-aad8-bd7cfc81251e

So, this should have nothing to do with RDF parsing.

Not sure if we will provide ARFF. It should not be too much of an effort, but I think Christoph and Nina planned to replace RDF OWL-DL with a fixed data model, represented for example in JSON.

@alphaville
Author

Did you read the blog? RDF parsing consumes 2.79 GB compared to 1 MB for ARFF (see http://opentox.ntua.gr/index.php/blog/76-rdf-opentox-discussion?showall=1&limitstart= ). It consumes all the server's RAM, starts using swap, and responds far too slowly.

@mguetlein mguetlein reopened this May 26, 2011
@mguetlein
Owner

Sorry, Pantelis, I had not read the blog completely.

But the timeout occurs during a simple GET request to your model. Do you parse a big RDF file when a GET request to an existing model is performed?


@mguetlein
Owner

One more thing, Pantelis:
I do think the RDF scalability issue is a severe problem that we have to solve, and it is good that you are doing these investigations.
But IMHO this should still never cause timeouts during the model-building process. This is what tasks are for: first the model-building service returns the task to the client, then it starts processing the RDF data.
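The task pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual OpenTox service code; the class and method names (TaskService, submit, getStatus) and the pool size of 2 are assumptions chosen to match the thread:

```java
import java.util.UUID;
import java.util.concurrent.*;

// Hypothetical sketch of the task pattern: the service hands back a task id
// immediately, and the heavy RDF work runs in a background pool.
public class TaskService {
    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    private final ConcurrentMap<String, String> states = new ConcurrentHashMap<>();

    // Called by the HTTP handler: returns at once, before any RDF is touched.
    public String submit(Runnable heavyWork) {
        String taskId = UUID.randomUUID().toString();
        states.put(taskId, "QUEUED");
        pool.submit(() -> {
            states.put(taskId, "RUNNING");
            heavyWork.run();                 // e.g. download and parse the dataset
            states.put(taskId, "COMPLETED");
        });
        return taskId;                       // client then polls GET /task/{taskId}
    }

    public String getStatus(String taskId) {
        return states.getOrDefault(taskId, "UNKNOWN");
    }

    public void shutdown() throws InterruptedException {
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

With this split, the HTTP response time depends only on creating the task record, never on the RDF parsing itself.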

@alphaville
Author

Yes, that's right. First a task is created with status QUEUED; up to that point nothing happens. After that, and provided no more than 2 other tasks are already running on the system, the task is submitted to the execution pool. Then it starts downloading and parsing stuff and stuffs the memory with RDF triples... and then the system hangs and crawls ;) Any other running tasks hang too! Even the Apache server is dead at that point. If you stand in front of this computer's screen you can hardly move the mouse pointer, because the whole RAM is occupied, and in some cases even half of the swap space! The same holds for GET on /task/id. So it is a matter of RDF scalability.

@mguetlein
Owner

;-))) Very nice description. I see. We should raise the scalability issue on the mailing list...

@vedina

vedina commented May 27, 2011

> the task is submitted to the execution pool. Then it starts downloading and parsing stuff and stuffs the memory with RDF triples... and then the system hangs and crawls ;)

It seems the downloading and parsing are done in one go. If so, it would be less blocking if the task were accepted and the task URI returned immediately; the download then starts and writes the data to a file, which is parsed into RDF only upon completion.
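The two-phase approach suggested above could look roughly like this. It is a sketch, not the actual service code: the class and method names are hypothetical, and the line count in parse() is a stand-in for the real Jena parse of the completed file:

```java
import java.io.InputStream;
import java.nio.file.*;

// Hypothetical sketch: phase 1 streams the remote payload to disk, phase 2
// parses only once the file is complete, so a slow download never keeps the
// parser (and its memory) busy.
public class TwoPhaseFetch {

    // Phase 1: copy the payload to a temp file; nothing is parsed yet.
    public static Path download(InputStream remote) throws Exception {
        Path tmp = Files.createTempFile("dataset-", ".rdf");
        Files.copy(remote, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    // Phase 2: parse the completed file. A real service would hand the file
    // to Jena here, e.g. jenaModel.read(file.toUri().toString(), null);
    // counting lines is just a placeholder for that step.
    public static long parse(Path file) throws Exception {
        return Files.lines(file).count();
    }
}
```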

@alphaville
Author

The task is returned to the client immediately. That is the first action: before downloading or parsing anything, a task is created which (if the server is not running lots of other jobs) is submitted for execution. The HTTP connection is closed immediately and the client does not need to wait for anything, so no timeouts are expected... except if the machine can't take it because some task running in the background consumes all resources.

@vedina

vedina commented May 28, 2011

I did some tests with this dataset:

    import com.hp.hpl.jena.ontology.OntModelSpec;
    import com.hp.hpl.jena.rdf.model.Model;
    import com.hp.hpl.jena.rdf.model.ModelFactory;

    public void readRDF() {
        Model jenaModel = ModelFactory.createOntologyModel(OntModelSpec.OWL_DL_MEM);
        long mem0 = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
        System.out.println("Memory used: " + mem0 / 1024 + " K bytes");
        long now = System.currentTimeMillis();

        jenaModel.read("http://apps.ideaconsult.net:8080/ambit2/dataset/585036", null);
        long mem1 = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
        System.out.println("Memory used for Jena object " + (mem1 - mem0) / 1024 + " K bytes");
        System.out.println("Dataset read in " + (System.currentTimeMillis() - now) + " ms");
    }

Printout from the code above, when using OWL model

Model jenaModel = ModelFactory.createOntologyModel(OntModelSpec.OWL_DL_MEM);
Memory used: 3622 K bytes
Memory used for Jena object 245429 K bytes
Dataset read in 144273 ms

Printout from the code above, when using the non-OWL default model

Model jenaModel = ModelFactory.createDefaultModel();
Memory used: 1358 K bytes
Memory used for Jena object 243377 K bytes
Dataset read in 108253 ms

In the worst case it is 245 MB in memory, nowhere near 2.5 GB.
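One caveat on the measurement technique in the snippet above (my addition, not from the thread): totalMemory() - freeMemory() also counts garbage that has not been collected yet, so successive samples can differ considerably. Hinting a collection before sampling, as sketched below, gives a steadier baseline; the class name is illustrative:

```java
// Hypothetical helper for a steadier heap sample than a raw
// totalMemory() - freeMemory() read.
public class HeapSample {
    public static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        rt.gc();  // a hint only; the result is still approximate
        return rt.totalMemory() - rt.freeMemory();
    }
}
```

Even with the gc() hint the figure is approximate, but it removes most of the noise from uncollected garbage between two samples.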
