CouchDB problems with Xcode CI

We've got massive problems using Xcode CI in our project. After creating and configuring a simple bot, the CouchDB process beam.smp, which is part of the Xcode CI system, runs amok and causes a constant CPU load of about 200% (even without a running integration).

This causes all sorts of problems and eventually makes all subsequent integrations fail. The Xcode Server configuration then sometimes shows a message saying that the configuration cannot be read.


Restarting all Xcode Server processes only helps for a short time, until the CouchDB process runs amok again.
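To see how quickly the load comes back after a restart, the CPU share of beam.smp can be added up from `ps` output. This is only a minimal sketch: `cpu_total` is a made-up helper name, and it assumes `ps -Ao pcpu,comm` prints one "<percent> <command>" line per process.

```shell
# Sum the %CPU of every process whose command name matches the given name.
# Reads `ps -Ao pcpu,comm` style lines ("<percent> <command>") from stdin.
# (cpu_total is a hypothetical helper, not part of Xcode Server.)
cpu_total() {
    awk -v name="$1" '
        {
            pct = $1
            # Drop the percentage column; what remains is the command.
            $1 = ""
            sub(/^ +/, "")
            # The command may be a full path; compare only the last component.
            n = split($0, parts, "/")
            if (parts[n] == name) total += pct
        }
        END { printf "%.1f\n", total }
    '
}

# Usage:
#   ps -Ao pcpu,comm | cpu_total beam.smp
```

Run from a loop with a `sleep` in between, this shows roughly when the process climbs back towards the 200% mark.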


Has anyone experienced something similar? We're a bit lost here. We've tried setting the CouchDB log level to debug but couldn't find the actual cause of the problem.

Any help would be much appreciated.


macOS 10.12.4

macOS Server 5.3

Xcode 8.3

Replies

Came here to say "Me too!" and give some stern looks! 😠👿


Nah, you guys rock, keep up the good work.

We are experiencing the same problems. We use just one bot. Absolutely useless.

Apple, where can we send the bill to for all the wasted time?

"Me too" for the first two points.

We also had to turn off bots completely, as the Mac mini ran out of RAM. We upgraded it with 16 GB and turned them on again. We still have to see whether the UI tests will run again, but we now once more have the problem that the fans run for 15 minutes, go quiet for 2 and then turn on again. If no solution is found, we will have to turn the bots off again, as the noise is unbearable (our office only has one room, so we can't move the Mac anywhere else).

After some digging into Xcode Server's node.js application, I believe I found a work-around. At least, it fixed everything for us: both the CPU usage problems with the beam.smp and node processes, and the trouble with devices and simulators being intermittently lost and unavailable for testing.


I wrote up a detailed description here (link broken deliberately to avoid this reply getting stuck in the moderation queue):

h t t p s : // github.com/juce/xcs-tweaks


The quick summary is this:

It is an expensive operation for Xcode Server to retrieve information about available devices and simulators from CouchDB. So to speed things up, it caches that info in Redis. However, for whatever reason, reading that info back from Redis is even slower than querying CouchDB for it. It is during those reads from Redis that a node process starts using 100% CPU. These slow reads cause timeouts, which in turn make the Xcode Server API unresponsive, which cascades to failing integrations and causes problems when you try to edit a bot. So, the fix is to cache in a file on disk instead of in Redis.
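The actual patch is JavaScript inside xcsd, so the following is not that code. It is only a small shell sketch of the general idea of keeping the result of an expensive lookup in a file on disk. `cached_output`, its arguments and the max-age check are assumptions made for illustration.

```shell
# Print the output of an expensive command, reusing a copy cached in a file
# while that copy is younger than the given number of minutes.
# $1 = cache file, $2 = max age in minutes, remaining args = command to run.
# (cached_output is a hypothetical helper, not what the patch does literally.)
cached_output() {
    cache_file=$1
    max_age_min=$2
    shift 2

    # `find FILE -mmin -N` prints FILE only if it was modified recently enough.
    if [ -f "$cache_file" ] &&
       [ -n "$(find "$cache_file" -mmin "-$max_age_min" 2>/dev/null)" ]; then
        cat "$cache_file"
        return 0
    fi

    # Cache miss: run the command into a temp file, then rename it, so a
    # reader never sees a half-written cache file.
    tmp_file="$cache_file.tmp.$$"
    if "$@" > "$tmp_file"; then
        mv "$tmp_file" "$cache_file"
        cat "$cache_file"
    else
        rc=$?
        rm -f "$tmp_file"
        return "$rc"
    fi
}

# Usage (the command name here is a placeholder):
#   cached_output /tmp/device-list.cache 10 some_slow_device_query
```

The second and later calls within the max age only read the file, which is the same trade the work-around makes: a cheap local read instead of a slow round trip.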


For that you need to patch this file:

/Library/Developer/XcodeServer/CurrentXcodeSymlink/Contents/Developer/usr/share/xcs/xcsd/classes/deviceClass.js

with this diff:

h t t p s : // github.com/juce/xcs-tweaks/blob/master/xcs-devices-patch.diff
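Since this modifies a file that ships with Xcode Server, it may be worth testing the diff with a dry run and keeping a backup before applying it. A minimal sketch, assuming the diff has been saved locally and that the shell has enough privileges to write to the file; `apply_patch_safely` and the backup suffix are made-up names.

```shell
# Apply a diff to a single file, but only if a dry run succeeds first.
# $1 = file to patch, $2 = diff file. Keeps a copy as "<file>.orig-backup".
# (apply_patch_safely is a hypothetical helper.)
apply_patch_safely() {
    target=$1
    diff_file=$2

    # Check without touching the file; a failed hunk stops us here.
    # --forward avoids the interactive "reversed patch?" question.
    if ! patch --dry-run --forward "$target" "$diff_file" >/dev/null; then
        echo "dry run failed; $target left unchanged" >&2
        return 1
    fi

    # Keep the original so the change can be undone by copying it back.
    cp -p "$target" "$target.orig-backup" || return 1
    patch --forward "$target" "$diff_file"
}

# Example (the local diff filename is an assumption):
#   apply_patch_safely \
#     /Library/Developer/XcodeServer/CurrentXcodeSymlink/Contents/Developer/usr/share/xcs/xcsd/classes/deviceClass.js \
#     xcs-devices-patch.diff
```

If the dry run reports a failed hunk, the file is left alone and the affected lines can be edited by hand using the diff as a guide.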


This worked for us, and I'm hoping it will help others too.

ajouline, you saved us a lot of time.


We were ready to try downgrading to Xcode 8.3.1 when I saw your post on this thread. I applied the patch and can safely say that it works for us too. However, just to let you know, the patch command failed with `Hunk #2 FAILED at 142.` We fixed it manually, so it was not a big problem.


Thanks for sharing this work-around.

Glad to hear it helped!


Thanks for pointing out the issue with the diff. Looks like it was due to some extra logging statements I added during early troubleshooting. I fixed that, and the diff should apply cleanly now.

As described in the other posting, your fix really helped to solve our problems (at least for now).

So the whole CPU load problem seems to be related to the simulator / device list problem.

You are a genius. Thank you.

Server 5.3.1 is out now. Is this issue solved?

Unfortunately, new macOS Server versions can't fix the problem, since the problem lies in Xcode Server, which is part of the Xcode bundle.

Hi Brent,


thanks for keeping on top of this.


We have the same issue, but even more amazing is that our test target is macOS, so there literally are no "devices" to look for.


I have an xcsdiagnose file, but I don't think Radar will allow me to upload a 1.6 GB tarball. What is the recommended way to get such large files to you?

Xcode 8.3.3 claims to have fixed this issue.


Fixed issues with Xcode Server that caused excessive CPU usage. (31874759)


https://developer.apple.com/library/content/releasenotes/DeveloperTools/RN-Xcode/Chapters/Introduction.html

This fix works like a charm, @ajouline! Amazing, thank you for the solution 🙂


The patch command didn't work for me (using Xcode 8.3.3), though, so I manually updated the lines of code in deviceClass.js by looking at your diff file.

After updating to Xcode 9 on the server, beam.smp is again using over 100% CPU (for several hours now, and it doesn't seem to calm down).

The patch from this thread https://forums.developer.apple.com/thread/76450 cannot be applied to the new xcs_devices.json (and I'm not even sure it would resolve the issue, as I haven't tracked down the cause).

Anyone else observing the same? Any insights on this? Any workaround / solution?

Please file a new bug and attach the output of:


sudo xcrun xcsdiagnose


Thank you!