Re: Afserver crashes

Posted: Tue Mar 14, 2017 3:29 pm
by selsner
Or maybe you can give me a good hint on how to properly attach valgrind, so we get more debug info.

Re: Afserver crashes

Posted: Tue Mar 14, 2017 5:08 pm
by timurhai
Hi.
Since it crashes when a new thread starts, I can point to the exact SIGSEGV location in the code:
https://github.com/CGRU/cgru/blob/maste ... d.cpp#L255

Unfortunately, for now I can't say why there is some probability of hanging there,
or why this probability differs between platforms.

The pthread library uses the "clone" system call.
For now I don't know what else we need to find out or what more valgrind can tell us, but maybe it can help.

Re: Afserver crashes

Posted: Wed Mar 15, 2017 9:21 am
by timurhai
Hi.

I just wrote (and committed) a thread-raising test:
https://github.com/CGRU/cgru/blob/maste ... st.cpp#L41
It is an afcmd command that does almost the same thing afserver does on each new connection.

Strange results:

On my laptop (ubuntu16, gcc 5.4) I can raise 1 000 000 threads. I tried many times, with no crashes.
But afserver crashes on this platform when I spawn 200 afrenders.

At work (ubuntu14, gcc 4.8) this test can crash starting from 1000 threads and definitely crashes at 100 000 threads.
But I can't crash afserver on this platform, including crash tests with 200 afrenders.

I should read more about the pthread library. It is likely that Afanasy uses it incorrectly in some way.
(Note that the DlThread class was simply taken from 3delight, which is not a server but a render engine.)

You are welcome to test this command on your platforms.

Re: Afserver crashes

Posted: Wed Mar 15, 2017 4:53 pm
by selsner
I am not able to get it to crash ....

Re: Afserver crashes

Posted: Wed Mar 15, 2017 5:30 pm
by seven11
@selsner,
Are you running the afserver on a 32-bit or a 64-bit kernel?

Timur,
Why is the stack size for each thread so large? I have been testing an afserver with the thread stack size set to 32768 bytes, which reduces the virtual memory usage. I couldn't find where the stack size was being set, so I added pthread_attr_setstacksize(&attr, 32768); to dlThread.cpp.

Have a look at the first answer of this post:
http://stackoverflow.com/questions/5585 ... -of-memory

Scott

Re: Afserver crashes

Posted: Wed Mar 15, 2017 7:17 pm
by seven11
So on my afserver machine, each pthread stack uses 10485760 bytes (10 MiB) of virtual memory on a stock 2.2.1 afserver.
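
To see what a given machine uses by default, and how quickly the reserved address space grows with the thread count, here is a minimal sketch. The function names are made up for illustration and are not part of CGRU. It assumes a POSIX system and linking with -pthread.

```cpp
#include <pthread.h>
#include <cstddef>

// Stack size, in bytes, that a default-initialized pthread_attr_t gives a new thread.
// Returns 0 if the attribute object cannot be initialized or queried.
// (Hypothetical helper, not part of CGRU.)
std::size_t default_thread_stack_size()
{
    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0)
        return 0;

    std::size_t size = 0;
    if (pthread_attr_getstacksize(&attr, &size) != 0)
        size = 0;

    pthread_attr_destroy(&attr);
    return size;
}

// Virtual address space reserved for the stacks of `threads` threads
// that are alive at the same time. (Hypothetical helper.)
unsigned long long reserved_stack_bytes(std::size_t stack_size,
                                        unsigned long long threads)
{
    return static_cast<unsigned long long>(stack_size) * threads;
}
```

With the 10485760-byte stacks reported above, 1000 threads alive at once reserve 10485760000 bytes (roughly 10 GB) of address space; with 32768-byte stacks the same 1000 threads reserve 32768000 bytes (roughly 33 MB). This is reserved virtual memory, not resident memory.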

Here's the patch code if someone would like to try it:

--- dlThread.cpp	2017-03-15 12:09:05.300428237 -0700
+++ dlThread.cpp.mod	2017-03-15 12:08:45.772250461 -0700
@@ -23,6 +23,7 @@
 /* The following chunk are for GetNbProcessors. */
 #ifdef LINUX
 #include	<sys/sysinfo.h>
+#include <stdio.h>
 #endif
 
 #ifdef IRIX
@@ -246,6 +247,12 @@
 #else
 	pthread_attr_t attr;
 	pthread_attr_init( &attr );
+	pthread_attr_setstacksize(&attr, 32768);
+
+//	size_t size;
+//	int ret = pthread_attr_getstacksize(&attr, &size);
+//	printf ( "Get: ret=%d,size=%zu\n" , ret , size ) ;
+
 	if( detached )
 	{
 		pthread_attr_setdetachstate( &attr, PTHREAD_CREATE_DETACHED );

Scott

Re: Afserver crashes

Posted: Wed Mar 15, 2017 9:27 pm
by timurhai
My latest post and commit had a mistake.
DlThread did not check whether the thread was actually started,
and its Start() function returned void.
We can't raise 1 000 000 threads at the same time.
That test simply stopped raising threads at some point, without any error message.
With default settings on Linux you normally can't spawn more than about 32k threads, and this is more than enough for us.
I have now changed DlThread to return the integer that pthread_create returns,
and added a check in afserver and in the afcmd test.
https://github.com/CGRU/cgru/commit/583 ... c985e005b3

A 1 ms sleep was also added to the afcmd thread-raising loop.
Each afcmd test thread sleeps (lives) for 1 second, so about 1 s / 1 ms = 1000 threads exist at the same time, which is much less than the limit (32k).
The 1 000 000 test now takes some time, but it passed (on ubuntu16).
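
A minimal sketch of such a throttled test loop, for anyone who wants to try the idea outside of afcmd. This is not the CGRU code: start_detached, raise_threads and sleep_for_ms are hypothetical names, and the only assumption taken from the post is that the return code of pthread_create is checked and that there is a short pause between thread starts.

```cpp
#include <pthread.h>
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <thread>

// Thread body for the test: stay alive for the number of milliseconds packed into `arg`.
static void* sleep_for_ms(void* arg)
{
    const auto ms = reinterpret_cast<std::intptr_t>(arg);
    std::this_thread::sleep_for(std::chrono::milliseconds(ms));
    return nullptr;
}

// Start one detached thread and hand the pthread error code back to the caller
// instead of dropping it (0 means the thread was started).
int start_detached(void* (*func)(void*), void* arg)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

    pthread_t tid;
    rc = pthread_create(&tid, &attr, func, arg);
    pthread_attr_destroy(&attr);
    return rc;
}

// Raise `count` threads that each live `life_ms`, pausing `gap_ms` between starts.
// Stops at the first failure, stores its error code in *error,
// and returns how many threads were started.
long raise_threads(long count, long life_ms, long gap_ms, int* error)
{
    *error = 0;
    void* life = reinterpret_cast<void*>(static_cast<std::intptr_t>(life_ms));
    for (long i = 0; i < count; ++i)
    {
        const int rc = start_detached(sleep_for_ms, life);
        if (rc != 0)
        {
            std::fprintf(stderr, "pthread_create failed after %ld threads: %s\n",
                         i, std::strerror(rc));
            *error = rc;
            return i;
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(gap_ms));
    }
    return count;
}
```

A call like raise_threads(1000000, 1000, 1, &err) would mirror the numbers above: about 1000 threads alive at any moment, and at least 1000 seconds of run time for the whole loop.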

I also ran a test with 200 afrenders (on ubuntu16), but I did not get any new error message from pthread_create.
Just a segmentation fault:
Backtrace:
/lib/x86_64-linux-gnu/libpthread.so.0(pthread_create+0x4ff)[0x7f1a1777ee8f]
afserver(_ZN8DlThread5StartEPFvPvES0_+0xc5)[0x4c96d5]
afserver(_Z16threadAcceptPortPvi+0x509)[0x45f049]

So now I am at the same point as before.

I will keep digging into this issue...

Re: Afserver crashes

Posted: Wed Mar 15, 2017 10:02 pm
by timurhai
@seven11
Great idea!
I added:

pthread_attr_setstacksize(&attr, 32768);

And it has been working for about half an hour!
It has never worked this long on my testing system.

Reading more about stack size...
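
One thing worth knowing here: pthread_attr_setstacksize returns an error code, and POSIX allows it to reject values below PTHREAD_STACK_MIN. A minimal sketch that checks both, using hypothetical helper names that are not part of dlThread.cpp:

```cpp
#include <pthread.h>
#include <limits.h>   // PTHREAD_STACK_MIN
#include <cstddef>

// Initialize `attr` and request a stack of `requested` bytes,
// raised to PTHREAD_STACK_MIN if the request is smaller.
// Returns 0 on success or the pthread error code. (Hypothetical helper.)
int init_attr_with_stack(pthread_attr_t* attr, std::size_t requested)
{
    int rc = pthread_attr_init(attr);
    if (rc != 0)
        return rc;

    const std::size_t minimum = static_cast<std::size_t>(PTHREAD_STACK_MIN);
    const std::size_t size = requested < minimum ? minimum : requested;

    rc = pthread_attr_setstacksize(attr, size);
    if (rc != 0)
        pthread_attr_destroy(attr);
    return rc;
}

// Trivial thread body that needs almost no stack.
static void* noop(void*)
{
    return nullptr;
}

// Start and join one thread with the requested stack size.
// Returns the stack size that was actually configured, or 0 on any failure.
std::size_t run_thread_with_stack(std::size_t requested)
{
    pthread_attr_t attr;
    if (init_attr_with_stack(&attr, requested) != 0)
        return 0;

    std::size_t actual = 0;
    pthread_attr_getstacksize(&attr, &actual);

    pthread_t tid;
    const int rc = pthread_create(&tid, &attr, noop, nullptr);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return 0;

    pthread_join(tid, nullptr);
    return actual;
}
```

The stack has to hold the deepest call chain of the thread function, including any large local buffers, so a value as small as 32768 is only safe if the per-connection code stays shallow. Overflowing the stack would itself show up as a segmentation fault.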

Re: Afserver crashes

Posted: Thu Mar 16, 2017 3:42 pm
by seven11
Great! I'm going to pull down the new afserver code and test it.
Scott

Re: Afserver crashes

Posted: Thu Mar 16, 2017 9:45 pm
by seven11
Timur,
So I'm now running a Git pull of 2.2.2 with the pthread stack size set to 32K on a CentOS 6.5 x86_64 machine (4 processors, 8 GB RAM).
I can spawn 1000 threads with the "afcmd tthr 1000" command. No issues so far.
I'll keep you posted...
Scott