
Request Timeout when running crossvalidation #14

Closed
alphaville opened this issue May 23, 2011 · 12 comments

Comments

@alphaville

Occasionally, a Request Timeout exception is thrown (see for example http://toxcreate2.in-silico.ch/task/4332 ). I have the feeling that this happens under heavy load on the server. I've checked the response times at opentox.ntua.gr:8080 and they remain low (see http://ambit.uni-plovdiv.bg/cgi-bin/smokeping.cgi?target=NTUA and http://opentox.ntua.gr:8080/monitoring).

@mguetlein
Owner

Pantelis, this timeout is thrown at your service. You should check your web-server error and access logs.

@alphaville
Author

I think this is of some interest: http://opentox.ntua.gr/index.php/blog/76-rdf-opentox-discussion?showall=1&limitstart= - Especially the last paragraph about RDF vs ARFF. It explains the timeout. Could you provide ARFF along with RDF for datasets?

@mguetlein
Owner

The timeout in this task (http://toxcreate2.in-silico.ch/task/4332) happens during a simple GET request to your service:

 rest_params: 
    :headers: 
      :accept: application/rdf+xml
      :subjectid: AQIC5wM2LY4SfcyXpalQEtoyjxZzHhZIMARV18Unjdb27k8=@AAJTSQACMDE=#
    :payload: 
    :rest_uri: http://opentox.ntua.gr:8080/model/9b84be8c-87d6-4405-aad8-bd7cfc81251e

So, this should have nothing to do with RDF parsing.

Not sure if we will provide ARFF. It should not be too much of an effort, but I think Christoph and Nina planned to replace RDF OWL-DL with a fixed data model, represented for example in JSON.

@alphaville
Author

Did you read the blog? RDF parsing consumes 2.79 GB compared to 1 MB for ARFF (see http://opentox.ntua.gr/index.php/blog/76-rdf-opentox-discussion?showall=1&limitstart= ). It consumes all the server's RAM, starts using swap, and responds far too slowly.

@mguetlein mguetlein reopened this May 26, 2011
@mguetlein
Owner

Sorry, Pantelis, I had not read the blog completely.

But the timeout occurs during a simple GET request to your model. Do you parse a big RDF file when a GET request to an existing model is performed?


@mguetlein
Owner

One more thing, Pantelis:
I do think the RDF scalability issue is a severe problem that we have to solve, and it is good that you are doing these investigations.
But IMHO this should still never cause timeouts during the model-building process. This is what tasks are for: first the model-building service returns the task to the client, then it starts processing the RDF data.
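The task pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual OpenTox service code; the class and method names (TaskService, submit, getStatus) and the pool size of 2 are assumptions chosen to match the thread:

```java
import java.util.UUID;
import java.util.concurrent.*;

// Hypothetical sketch of the task pattern: the service hands back a task id
// immediately, and the heavy RDF work runs in a background pool.
public class TaskService {
    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    private final ConcurrentMap<String, String> states = new ConcurrentHashMap<>();

    // Called by the HTTP handler: returns at once, before any RDF is touched.
    public String submit(Runnable heavyWork) {
        String taskId = UUID.randomUUID().toString();
        states.put(taskId, "QUEUED");
        pool.submit(() -> {
            states.put(taskId, "RUNNING");
            heavyWork.run();                 // e.g. download and parse the dataset
            states.put(taskId, "COMPLETED");
        });
        return taskId;                       // client then polls GET /task/{taskId}
    }

    public String getStatus(String taskId) {
        return states.getOrDefault(taskId, "UNKNOWN");
    }

    public void shutdown() throws InterruptedException {
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

With this split, the HTTP response time depends only on creating the task record, never on the RDF parsing itself.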

@alphaville
Author

Yes, that's right. First a task is created with status QUEUED; up to that point nothing happens. After that, and provided no more than 2 other tasks are already running on the system, the task is submitted to the execution pool. Then it starts downloading and parsing stuff and stuffs the memory with RDF triples... and then the system hangs and crawls ;) Any other running tasks hang too! Even the Apache server is dead at that point. If you stand in front of this computer's screen you can hardly move the mouse pointer, because the whole RAM is occupied, and in some cases even half of the swap space! The same holds for GET on /task/id. So it is a matter of RDF scalability.

@mguetlein
Owner

;-))) Very nice description. I see. We should raise the scalability issue on the mailing list...

@vedina

vedina commented May 27, 2011

> the task is submitted to the execution pool. Then it starts downloading and parsing stuff and stuffs the memory with RDF triples... and then the system hangs and crawls ;)

It seems the downloading and parsing are done in one go. If so, it would be less blocking if the task were accepted and the task URI returned immediately; the download then starts and writes the data to a file, which is parsed into RDF only upon completion.
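The two-phase approach suggested above could look roughly like this. It is a sketch, not the actual service code: the class and method names are hypothetical, and the line count in parse() is a stand-in for the real Jena parse of the completed file:

```java
import java.io.InputStream;
import java.nio.file.*;

// Hypothetical sketch: phase 1 streams the remote payload to disk, phase 2
// parses only once the file is complete, so a slow download never keeps the
// parser (and its memory) busy.
public class TwoPhaseFetch {

    // Phase 1: copy the payload to a temp file; nothing is parsed yet.
    public static Path download(InputStream remote) throws Exception {
        Path tmp = Files.createTempFile("dataset-", ".rdf");
        Files.copy(remote, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    // Phase 2: parse the completed file. A real service would hand the file
    // to Jena here, e.g. jenaModel.read(file.toUri().toString(), null);
    // counting lines is just a placeholder for that step.
    public static long parse(Path file) throws Exception {
        return Files.lines(file).count();
    }
}
```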

@alphaville
Author

The task is returned to the client immediately. That is the first action: before downloading or parsing anything, a task is created which (if the server is not running lots of other jobs) is submitted for execution. The HTTP connection is closed immediately and the client does not need to wait for anything, so no timeouts are expected... except if the machine can't take it because some task running in the background consumes all resources.

@vedina

vedina commented May 28, 2011

I did some tests with this dataset:

    import com.hp.hpl.jena.ontology.OntModelSpec;
    import com.hp.hpl.jena.rdf.model.Model;
    import com.hp.hpl.jena.rdf.model.ModelFactory;

    public void readRDF() {
        Model jenaModel = ModelFactory.createOntologyModel(OntModelSpec.OWL_DL_MEM);
        long mem0 = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
        System.out.println("Memory used: " + mem0 / 1024 + " K bytes");
        long now = System.currentTimeMillis();

        jenaModel.read("http://apps.ideaconsult.net:8080/ambit2/dataset/585036", null);
        long mem1 = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
        System.out.println("Memory used for Jena object " + (mem1 - mem0) / 1024 + " K bytes");
        System.out.println("Dataset read in " + (System.currentTimeMillis() - now) + " ms");
    }

Printout from the code above, when using OWL model

Model jenaModel = ModelFactory.createOntologyModel(OntModelSpec.OWL_DL_MEM);
Memory used: 3622 K bytes
Memory used for Jena object 245429 K bytes
Dataset read in 144273 ms

Printout from the code above, when using the non-OWL default model

Model jenaModel = ModelFactory.createDefaultModel();
Memory used: 1358 K bytes
Memory used for Jena object 243377 K bytes
Dataset read in 108253 ms

In the worst case it is 245 MB in memory, nowhere near 2.5 GB.
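One caveat on the measurement technique in the snippet above (my addition, not from the thread): totalMemory() - freeMemory() also counts garbage that has not been collected yet, so successive samples can differ considerably. Hinting a collection before sampling, as sketched below, gives a steadier baseline; the class name is illustrative:

```java
// Hypothetical helper for a steadier heap sample than a raw
// totalMemory() - freeMemory() read.
public class HeapSample {
    public static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        rt.gc();  // a hint only; the result is still approximate
        return rt.totalMemory() - rt.freeMemory();
    }
}
```

Even with the gc() hint the figure is approximate, but it removes most of the noise from uncollected garbage between two samples.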
