Request Timeout when running crossvalidation #14
Comments
Pantelis, this timeout is thrown at your service. You should check your web server's error and access logs.
I think this is of some interest: http://opentox.ntua.gr/index.php/blog/76-rdf-opentox-discussion?showall=1&limitstart= (especially the last paragraph about RDF vs. ARFF). It explains the timeout. Could you provide ARFF along with RDF for datasets?
The timeout in this task (http://toxcreate2.in-silico.ch/task/4332) happens during a simple GET request to your service, so this should have nothing to do with RDF parsing. Not sure if we will provide ARFF. It would not be too much effort, but I think Christoph and Nina planned to replace RDF/OWL-DL with a fixed data model, represented for example in JSON.
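To illustrate the kind of fixed data model hinted at here, the sketch below represents a dataset as plain JSON instead of RDF/OWL-DL. All field names are hypothetical; no concrete schema had been agreed on in this thread:

```python
import json

# Hypothetical fixed data model for a dataset: compounds as rows,
# features as columns. The field names ("uri", "features", "compounds",
# "values") are illustrative assumptions, not an agreed OpenTox schema.
dataset = {
    "uri": "http://example.org/dataset/1",
    "features": ["logP", "MW"],
    "compounds": [
        {"uri": "http://example.org/compound/1", "values": [1.2, 180.2]},
        {"uri": "http://example.org/compound/2", "values": [0.7, 94.1]},
    ],
}

# A compact, single-pass-parseable serialization, in contrast to loading
# an RDF graph of triples into memory.
serialized = json.dumps(dataset)
print(len(serialized), "bytes")
```

Parsing this back is a single `json.loads` call with memory proportional to the data itself, which is the scalability argument being made for a tabular format such as ARFF or JSON.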
Did you read the blog? RDF parsing consumes 2.79GB compared to 1MB for ARFF (see http://opentox.ntua.gr/index.php/blog/76-rdf-opentox-discussion?showall=1&limitstart= ). It consumes all the RAM of the server, which then starts using swap and responds too slowly.
Sorry, Pantelis, I did not read the blog completely. But the timeout occurs during a simple GET request to your model. Do you parse a big RDF file when a GET request to an existing model is performed?
One more thing, Pantelis:
Yes, that's right. First a task is created with status QUEUED; up to that point nothing happens. After that, and provided no more than 2 other tasks are running on the system, the task is submitted to the execution pool. Then it starts downloading and parsing stuff and stuffs the memory with RDF triples... and then the system hangs and crawls ;) Any other running tasks hang too! Even the Apache server is dead at that point. If you stand in front of the screen of this computer you're hardly able to move the mouse pointer. The reason is that the whole RAM is occupied, and in some cases even half of the swap space!!! The same holds for GET on /task/id. Therefore, it is a matter of RDF scalability.
;-))) very nice description. I see. We should raise the scalability issue on the mailing list...
It seems like the downloading and parsing are done in one go. If so, it would be less blocking if the task is accepted and the task URI is returned immediately. Then the download starts and writes data into a file, and only upon completion is it parsed into RDF.
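The download-then-parse split suggested here can be sketched as follows. `download_to_file` is a hypothetical helper, not part of any OpenTox service; the RDF parsing step itself is deliberately omitted:

```python
import shutil
import tempfile
import urllib.request

def download_to_file(url, chunk_size=64 * 1024):
    # Stream the response to a temporary file in fixed-size chunks, so the
    # payload never has to sit in RAM during the transfer. Only once the
    # download has completed would the file be handed to the RDF parser.
    tmp = tempfile.NamedTemporaryFile(delete=False)
    with urllib.request.urlopen(url) as response, tmp:
        shutil.copyfileobj(response, tmp, chunk_size)
    return tmp.name
```

A slow or very large transfer then costs disk space rather than memory, and a failed download can be detected before any expensive parsing starts.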
The task is returned immediately to the client. This is the first action: before downloading or parsing anything, a task is created which (if the server is not running lots of other jobs) is submitted for execution. The HTTP connection is closed immediately and the client does not need to wait for anything. No timeouts are expected, except if... the machine can't take it because some task running in the background consumes all resources.
Did some tests with this dataset
Printout from the code above, when using the OWL model:
Printout from the code above, when using non-OWL mode:
In the worst case it is 245MB in memory, not in any way close to 2.5 GB.
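One way to make such memory comparisons reproducible is to measure peak allocation while each model is built. The sketch below uses Python's `tracemalloc` with toy stand-ins (a triple-per-cell representation vs. a flat table of the same values); these are assumptions for illustration, not the actual Jena OWL/non-OWL models tested above:

```python
import tracemalloc

def peak_bytes(build):
    # Peak bytes allocated while building an in-memory model.
    tracemalloc.start()
    model = build()  # keep a reference so the model stays allocated
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del model
    return peak

rows, cols = 1000, 10
# RDF-like: one (subject, predicate, value) triple per cell.
build_triples = lambda: [(f"row{r}", f"col{c}", float(r * c))
                         for r in range(rows) for c in range(cols)]
# ARFF-like: a flat table holding only the values.
build_table = lambda: [[float(r * c) for c in range(cols)] for r in range(rows)]

# The per-cell subject/predicate strings make the triple model far heavier,
# which is the shape of the RDF vs. ARFF gap debated in this thread.
assert peak_bytes(build_triples) > peak_bytes(build_table)
```

The absolute numbers depend on the parser and model implementation, which is presumably why the OWL and non-OWL Jena modes above differ so much from the figures in the blog post.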
Occasionally, a Request Timeout exception is thrown (see for example http://toxcreate2.in-silico.ch/task/4332 ). I have the feeling that this happens under heavy load on the server. I've checked the response times at opentox.ntua.gr:8080 and they remain low (see http://ambit.uni-plovdiv.bg/cgi-bin/smokeping.cgi?target=NTUA and http://opentox.ntua.gr:8080/monitoring).