Thread Status: Active
Total posts in this thread: 54
Previous Thread This topic has been viewed 39085 times and has 53 replies Next Thread
I need a bath
Senior Cruncher
USA
Joined: Apr 12, 2007
Post Count: 347
Re: Tasks are not checkpointing proporly

Is there any chance that some change that was made in the Windows beta version of CEP2 was also ported to the Linux version? It seems as if a lot of us who've been crunching CEP2 for quite a while on Linux are now experiencing problems that we didn't used to have.


Hmmm
I have 1 year 132 days on this project with a computer that has the exact same setup. Only NOW do I have issues. NOTHING has changed except possibly an Ubuntu update.
----------------------------------------

[Oct 4, 2010 10:20:16 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: Tasks are not checkpointing proporly

Is there any chance that some change that was made in the Windows beta version of CEP2 was also ported to the Linux version? It seems as if a lot of us who've been crunching CEP2 for quite a while on Linux are now experiencing problems that we didn't used to have.

That, I think, would require a version control failure... how likely is that? Nope, it's still the same 6.19 for Linux. I've stuck to the 2.6.32 kernel, which is what 10.04.1 LTS (Long Term Support) uses; the latest ultra-minor update is something like 2.6.32.25. CEP2 keeps churning fine on at most 2 of 4 cores, with HPF2, HCMD2 and C4CW on the side.

I flip-flop the device profile once a day, incrementing the cache slowly to get the right mix, then set the cache back to 1 day. It's reasonably manageable, as run times are now generally 8.5+ hours (still with 30-40 minutes of overhead per CEP2 task). I don't need to buffer too many tasks to get through a couple of days, and since my client is set to jump the repairs to the top of the queue, I seem to be staying at average return times below 2 days... and so get more repair work. I have the impression there are not very many R++ devices on Linux, as the ratio of tasks with a < 4 day deadline is nearing 20%: a mix of the selected sciences, but mostly CEP2, which then run fine... on an old Q6600 at stock speed.

edit: both the currently running CEP2 jobs are repairs ;O)
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Oct 5, 2010 5:50:04 AM]
NightBlade
Advanced Cruncher
Joined: Jun 10, 2008
Post Count: 89
Re: Tasks are not checkpointing proporly

What are repair WUs?
----------------------------------------

[Oct 5, 2010 10:29:53 AM]
gb009761
Master Cruncher
Scotland
Joined: Apr 6, 2005
Post Count: 2955
Re: Tasks are not checkpointing proporly

Repair WUs are simply replacement WUs that are sent out to a known reliable host (see the FAQs for details) to replace WUs that either haven't been returned in time, errored out/aborted, or are verification WUs.
----------------------------------------

[Oct 5, 2010 10:37:59 AM]
I need a bath
Senior Cruncher
USA
Joined: Apr 12, 2007
Post Count: 347
Re: Tasks are not checkpointing proporly

Do the techs know about this problem? How do I find a log of what's going on to send to them?
----------------------------------------

[Oct 5, 2010 4:45:10 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: Tasks are not checkpointing proporly

The techs look at their periodic reports to see what stands out. What logs to send? They have the result log of course, and any of the stdxxxxxx.txt log files that hold pointers are of interest. Maybe the Linux event log shows something, but don't ask me right now how to get there; I've not done this or researched it myself.

Maybe entirely unrelated: I saw a bug report on NVidia driver 260.19.06 and all GPU tasks failing. My host got them, but all GPU crunching functions are disabled in the cc_config.xml.
----------------------------------------
[Edit 1 times, last edit by Sekerob at Oct 5, 2010 5:35:12 PM]
[Oct 5, 2010 5:19:54 PM]
I need a bath
Senior Cruncher
USA
Joined: Apr 12, 2007
Post Count: 347
Re: Tasks are not checkpointing proporly

What should stand out is that a lot of jobs are reporting as finished in about a third of the CPU time they should take. I think maybe the checkpoints are OK, but for some reason the CPU time for some of the intervals is reset back to the last checkpoint. It can't be that the job keeps starting over, or it would never finish. There is just a huge differential (and I mean hours and hours) between elapsed and reported CPU time. If folks don't mind, or don't notice, that they are only getting credit for 3-4 hours on jobs that ran over 8 hours, this might not be noticed. But the folks who have noticed are complaining. I don't know what is causing it. I didn't change BOINC versions or anything.
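
One rough way to put a number on the gap between elapsed and reported CPU time is to compare the two per task. The following is only a sketch: the function names, the 0.5 threshold, and the sample figures are assumptions chosen for illustration, not anything defined by WCG or BOINC.

```python
def cpu_time_ratio(elapsed_hours, cpu_hours):
    """Return reported CPU time as a fraction of elapsed (wall-clock) time."""
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    return cpu_hours / elapsed_hours


def looks_short(elapsed_hours, cpu_hours, threshold=0.5):
    """Flag a task whose reported CPU time is below `threshold` of elapsed time.

    The 0.5 default is an arbitrary, hypothetical cut-off.
    """
    return cpu_time_ratio(elapsed_hours, cpu_hours) < threshold


# Hypothetical task: ran for 8.5 hours but reported only 3 hours of CPU time.
print(looks_short(8.5, 3.0))
```

A task that ran normally on a dedicated core would be expected to show a ratio close to 1, so a ratio around a third is the kind of result worth annotating and posting.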
----------------------------------------

[Oct 5, 2010 5:40:07 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: Tasks are not checkpointing proporly

Post a full Result log of such a result with the respective CPU and elapsed times annotated. If time is getting lost (something that was reported early in the project, and the techs are aware of it), then the actual CPU times, best monitored from BOINCTasks or an old BOINC Manager (6.2.28 or earlier), should show the time skipping back and the % progress retreating, with checkpoint_debug logging switched on.
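
If the CPU time and % progress are noted down by hand at intervals, a "skip back" can be spotted mechanically. This is a minimal sketch that assumes the readings have been collected as a plain list of (cpu_seconds, percent_done) pairs; that format and the function name are hypothetical, not a BOINC log format.

```python
def find_retreats(samples):
    """Find points where CPU time or % progress went backwards.

    samples: list of (cpu_seconds, percent_done) in observation order.
    Returns a list of (index, lost_cpu_seconds, lost_percent) tuples,
    one for each sample that is lower than the sample before it.
    """
    retreats = []
    for i in range(1, len(samples)):
        prev_cpu, prev_pct = samples[i - 1]
        cpu, pct = samples[i]
        if cpu < prev_cpu or pct < prev_pct:
            retreats.append((i, prev_cpu - cpu, prev_pct - pct))
    return retreats


# Hypothetical readings: the fourth one falls back by 500 s and 4%.
readings = [(0, 0.0), (600, 5.0), (1200, 10.0), (700, 6.0), (1300, 11.0)]
print(find_retreats(readings))
```

Each reported retreat gives the amount of CPU time and progress that disappeared between two readings, which is the detail worth annotating alongside the Result log.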
----------------------------------------
[Oct 5, 2010 5:48:37 PM]
I need a bath
Senior Cruncher
USA
Joined: Apr 12, 2007
Post Count: 347
Re: Tasks are not checkpointing proporly

OK, so where do I find the Result Log? And how do I switch on checkpoint_debug logging?
Thanks
----------------------------------------

[Oct 5, 2010 5:53:55 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: Tasks are not checkpointing proporly

Result logs you find by clicking on the links in the Status column of the Result Status page.

The <checkpoint_debug> log flag has to be added to the cc_config.xml file. By default that file already exists on Linux in the /var/lib/boinc-client directory for Lucid. I don't know where it goes for other distros. How-to FAQ: http://boinc.berkeley.edu/wiki/Cc_config.xml
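
For illustration, this is roughly what the relevant part of cc_config.xml looks like with the flag switched on. The sketch below only builds and prints the XML; it does not touch any file, and the helper name build_cc_config is made up for this example. If the existing cc_config.xml already has other settings (such as options disabling GPU use), the log_flags entry should be merged into it by hand rather than overwriting the file.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags=("checkpoint_debug",)):
    """Return a minimal cc_config.xml document with the given log flags set to 1."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        # Each log flag is an element whose text is 1 (on) or 0 (off).
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


print(build_cc_config())
```

The client has to re-read its configuration (for example after a restart) before the extra checkpoint messages start appearing in its log.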
----------------------------------------
[Oct 5, 2010 6:03:19 PM]