[Thali-talk] Performance testing and release criteria for BLE on Android

Yaron Goland yarong at microsoft.com
Fri Aug 21 12:11:34 EDT 2015


Yup. My description was wrong. StopBroadcasting shouldn't happen until the coordinator tells the phones to do it. We can either do that as a side effect of getting a 'do set up for test' message or as a separate message. Your call.

And per our phone discussion today you can just send a string of 0s and have the other end send occasional ACKs. The idea is to put the data pressure only one way, not two. Eventually we should measure bi-directional bandwidth as well, but one step at a time.
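
In case it helps, a very rough sketch of what I have in mind for the two ends (the chunk size and ACK spacing are arbitrary, and the socket is whatever connection we already have open):

// Sender side: push a buffer of zeros one way, respecting back-pressure.
var chunk = new Buffer(64 * 1024);
chunk.fill(0);

function pumpZeros(socket, totalBytes, done) {
  var sent = 0;
  (function pump() {
    while (sent < totalBytes) {
      sent += chunk.length;
      if (!socket.write(chunk)) {
        return socket.once('drain', pump); // wait for the buffer to empty
      }
    }
    done(sent);
  })();
}

// Receiver side: count what arrives and send a tiny ACK now and then.
function ackOccasionally(socket, ackEveryBytes) {
  var received = 0, lastAck = 0;
  socket.on('data', function (data) {
    received += data.length;
    if (received - lastAck >= ackEveryBytes) {
      lastAck = received;
      socket.write('ACK'); // keeps the data pressure one way only
    }
  });
}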

     Thanks!

           Yaron

From: Jukka Silvennoinen
Sent: Friday, August 21, 2015 2:26 AM
To: Yaron Goland <yarong at microsoft.com>
Cc: thali-talk at thaliproject.org
Subject: RE: Performance testing and release criteria for BLE on Android

I have one issue and one question about the plan.

Issue: With the tests, you actually need to coordinate the ending of each test, i.e. in each test the phone should indicate to the coordinator when it has finished, and not call StopBroadcasting until the coordinator tells it to do so.
This is absolutely needed because, if we don't do that, faster phones could shut down their advertisements before the slower ones have seen them, meaning that the tests for the slow devices would fail.


Coordinator -> Phones: Run setup for test X

Coordinator<-Phones: Setup completed

Coordinator->Phones: Start test X

Coordinator<-Phones: Test X completed

Coordinator->Phones: End test X (the phone now calls StopBroadcasting)
Coordinator->Phones: Kill test X (used when the total time period to repeat the test runs out)
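
To make the flow concrete, the phone side could look roughly like this (the command names and the 'tests' hooks are only placeholders for whatever we actually implement):

var WebSocket = require('ws');

// 'tests' stands in for whatever test framework we build; it would expose
// doSetup/run/stop plus the call that wraps StopBroadcasting.
function startPhoneClient(coordinatorUrl, tests) {
  var ws = new WebSocket(coordinatorUrl);

  function send(command, testId) {
    ws.send(JSON.stringify({ command: command, testId: testId }));
  }

  ws.on('open', function () { send('ready'); });

  ws.on('message', function (raw) {
    var msg = JSON.parse(raw);
    switch (msg.command) {
      case 'setup':   // Run setup for test X
        tests.doSetup(msg.testId, function () { send('setup_done', msg.testId); });
        break;
      case 'start':   // Start test X
        tests.run(msg.testId, function () { send('test_done', msg.testId); });
        break;
      case 'end':     // End test X - only now call StopBroadcasting
        tests.stopBroadcasting(function () {});
        break;
      case 'kill':    // Kill test X - abandon the current run
        tests.stop(msg.testId);
        break;
    }
  });
}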

Question:

With "During the test each phone would connect to the other phones and send Z bytes of data. Z would be a very large number, say 1 gig. During the test if the connection failed for any reason the phone would automatically reconnect and continue sending any remaining data. The test ends when Z bytes have been successfully sent to all N-1 peers"

Which part of the system is handling the data counting & reconnection, and how?

-jukka-

From: Yaron Goland
Sent: 20 August 2015 23:42
To: Jukka Silvennoinen <jukka.silvennoinen at microsoft.com>
Cc: thali-talk at thaliproject.org
Subject: Performance testing and release criteria for BLE on Android


Note: This is going to the public mailing list

(Jukka, there is a request for you at the end of this mail)

At some point I'll release a blog article explaining the details but for those who don't know we have given up on Wi-Fi Direct Service Discovery. It's just too unreliable. So our plan is to switch to BLE and then based on customer interest we can potentially create a discovery mechanism based on Wi-Fi Direct Peer Discovery.

Our goal is to release the BLE-based discovery on Android in story 0.0, but we don't want a repeat of story 0, where we released something that really didn't work right.

To avoid that we need to put in place a number of performance tests.

Performance Test #1 - Discovery Perf - We want to run tests on N phones where each phone starts the test at the same instant, calls StartBroadcasting (this is all in Node) and measures how long it takes until it finds the other N-1 phones. Once each device has found the other N-1 phones then the phone would call StopBroadcasting and the test cycle would start again. The test would then keep repeating for a specified period of time.
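
As a sketch, one cycle of this test could look something like the following (the emitter API here, startBroadcasting plus a peerAvailabilityChanged event, is only an approximation for illustration, not a final API):

// Measures the time from StartBroadcasting until all N-1 peers are seen.
function runDiscoveryCycle(emitter, deviceName, serverPort, expectedPeers, onDone) {
  var found = {};
  var finished = false;
  var start = Date.now();

  emitter.on('peerAvailabilityChanged', function (peers) {
    peers.forEach(function (peer) {
      if (peer.peerAvailable) { found[peer.peerIdentifier] = true; }
    });
    if (!finished && Object.keys(found).length >= expectedPeers) {
      finished = true;
      onDone(null, Date.now() - start);  // ms until all N-1 peers were seen
    }
  });

  emitter.startBroadcasting(deviceName, serverPort, function (err) {
    if (err) { onDone(err); }
  });
}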

Performance Test #2 - Re-Connection Perf - We want to run tests on N phones. During setup each phone would call StartBroadcasting and wait until it had discovered the other N-1 phones. Once all the phones had successfully discovered the other phones then the test would start. During the test each phone would connect to the other phones and send B bytes of data, disconnect and re-connect again and send B bytes of data. This connect/disconnect cycle would repeat X times. B should be a tiny number since we are really just trying to measure how long disconnect/connect cycles take. Once X repetitions had completed across all the phones then that test cycle would be completed, the phone would call StopBroadcasting and we would start the test from scratch. We would keep repeating the test cycles for a specified period of time.
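
A rough sketch of the per-peer loop for this test ('conn' stands in for whatever connection layer we end up with; its connect/disconnect calls are placeholders):

function runReconnectCycles(conn, peerId, B, X, onDone) {
  var timings = [];
  var tinyBuffer = new Buffer(B);   // B is deliberately tiny
  tinyBuffer.fill(0);

  (function oneCycle(i) {
    if (i >= X) { return onDone(null, timings); }
    var start = Date.now();
    conn.connect(peerId, function (err, socket) {
      if (err) { return onDone(err); }
      socket.write(tinyBuffer, function () {
        conn.disconnect(peerId, function () {
          timings.push(Date.now() - start);  // one full connect/send/disconnect
          oneCycle(i + 1);
        });
      });
    });
  })(0);
}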

Performance Test #3 - Average Data Transfer Speed - We want to run tests on N phones. During setup each phone would call StartBroadcasting and wait until it had discovered the other N-1 phones. Once all the phones had successfully discovered the other phones then the test would start. During the test each phone would connect to the other phones and send Z bytes of data. Z would be a very large number, say 1 gig. During the test if the connection failed for any reason the phone would automatically reconnect and continue sending any remaining data. The test ends when Z bytes have been successfully sent to all N-1 peers. Once the test is over the phone would call StopBroadcasting and start the test from scratch. We would keep repeating the test cycles for a specified period of time.
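
For the sender side the idea is simply to keep a running count of bytes written and, on any connection failure, reconnect and resume the remainder. A sketch under the same placeholder 'conn' API as above (Z is the total byte target):

function sendZBytes(conn, peerId, Z, onDone) {
  var sent = 0;
  var chunk = new Buffer(64 * 1024);
  chunk.fill(0);

  (function connectAndSend() {
    conn.connect(peerId, function (err, socket) {
      if (err) { return setTimeout(connectAndSend, 1000); }  // retry the connect

      socket.on('error', function () {});          // errors surface via 'close'
      socket.on('close', function () {
        if (sent < Z) { connectAndSend(); }        // resume the remaining data
      });

      (function pump() {
        if (sent >= Z) { socket.end(); return onDone(null, sent); }
        var toSend = Math.min(chunk.length, Z - sent);
        socket.write(chunk.slice(0, toSend), function () {
          sent += toSend;                          // count only flushed bytes
          pump();
        });
      })();
    });
  })();
}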

Test Infrastructure - The key above is making sure that the tests all start at the same time and collecting results from all the devices. Our current thinking is that we would have a Node.js server running on a PC. It would act as the coordinator and would also collect the performance results. What we would do is have a JSON file in the Cordova test project that would specify the IP address/Port of the coordinator server as well as an integer indicating how many devices are supposed to be in the test. When the test code runs on each phone it will read that file. The phone could then use a pure JS library (I think it's pure) like WS to make a websocket call to the coordinator. We could then use that connection for the phone to indicate when it is ready.

The coordinator server would then wait to receive N "ready" signals over N web socket connections and could then simultaneously send out N 'start' signals. It would then wait to receive N done signals and record how much time passed between when the coordinator sent the start signal to all the devices and when it received the last done signal from all of the devices. It would record the delta as the result for that test (e.g. one result for the test across all the devices) and then send a new start signal to repeat the test.

This 'done-start' process would repeat until the pre-defined time limit for the test expired. When the test time expires the coordinator would send a signal to all the phones to kill their existing test run (if any) and start the next set of tests.
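
A bare-bones sketch of that coordinator using the WS package (the message names are only illustrative and should follow whatever signal set we settle on below; setup handling and the overall time limit are left out to keep it short):

var WebSocketServer = require('ws').Server;

// 'port' and 'expectedDeviceCount' would come from the JSON config file.
function startCoordinator(port, expectedDeviceCount) {
  var wss = new WebSocketServer({ port: port });
  var ready = 0, done = 0, startTime = null;
  var results = [];   // one entry per completed run of the current test

  function broadcast(msg) {
    wss.clients.forEach(function (client) {
      client.send(JSON.stringify(msg));
    });
  }

  wss.on('connection', function (ws) {
    ws.on('message', function (raw) {
      var msg = JSON.parse(raw);
      if (msg.command === 'ready' && ++ready === expectedDeviceCount) {
        startTime = Date.now();
        broadcast({ command: 'start', testId: 1 });
      } else if (msg.command === 'test_done' && ++done === expectedDeviceCount) {
        results.push(Date.now() - startTime);   // one delta across all devices
        done = 0;
        startTime = Date.now();
        broadcast({ command: 'start', testId: msg.testId });  // run it again
      }
      // When the allotted time expires we would broadcast 'kill' and report.
    });
  });
}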

In other words:

Time 0 - The phones boot up and start running Test 1 which requires them to immediately connect to the coordinator server to announce that they are ready.

Time 1 - The coordinator server hears from N phones (it gets N from the same config file as everyone else) and sends out a signal to run test 1.

Time 2 - The phones run test 1 (e.g. discover the other N-1 phones) and send signals to the coordinator server once they finish discovery.

Time 3 - The coordinator server receives N "completed test 1" signals. The coordinator server then calculates the total time the test instance ran (in this case 3 - 1 = 2) and records that in an array. Then the coordinator server sends a signal to run test 1 again.

This process repeats over and over until time M, which is the total time we want to let test 1 run. By time M the coordinator has an array of test results from each of its test 1 runs across the devices.

Time M - The coordinator sends out a kill signal and then tells the phones to start setup for test 2.

Time M+1 - All the devices indicate they have completed setup for test 2.

Time M+2 - The coordinator server receives N "set up completed" signals and so sends out a 'start test 2' signal.

Etc.

One suspects it would make the most sense to just have a standard set of signals:

Coordinator -> Phones: Run setup for test X

Coordinator<-Phones: Setup completed

Coordinator->Phones: Start test X

Coordinator<-Phones: Test X completed

Coordinator->Phones: Kill test X (used when the total time period to repeat the test runs out)
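
On the wire these could just be tiny JSON messages, something like the following (the field names are only a suggestion):

{ "command": "setup",      "testId": 1 }     Coordinator -> Phones
{ "command": "setup_done", "testId": 1 }     Phones -> Coordinator
{ "command": "start",      "testId": 1 }     Coordinator -> Phones
{ "command": "test_done",  "testId": 1 }     Phones -> Coordinator
{ "command": "kill",       "testId": 1 }     Coordinator -> Phones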

Perf Results - After each test run the coordinator should record how long it took from when the start signal was sent until the last completed signal was received from the devices. The coordinator will then create an array of times for all the test runs for a particular test. When the time allotted for all the test runs has elapsed the coordinator server would print four pieces of information.

The first three pieces of information are the 90th, 95th and 100th percentile results from the array of test results. This is calculated by taking the test results, ordering them from fastest to slowest and then finding the 90th and 95th percentile positions in the array as well as the last position (100th) and outputting those to console.log on the PC.

The fourth piece of information is the elapsed time between when the last test run started and when it was killed because the total time allotted for all the tests expired. In other words, imagine we are running test #1. We have allotted 10 minutes for all the test runs. We have repeated the test, say, 1000 times. The 1001st test run of Test #1 starts at minute 8. At that point one of the devices has a failure and won't ever complete the test. So at minute 10 the coordinator sends out a signal telling everyone to abandon the test. The coordinator would then calculate the 90th/95th/100th percentile results for all the completed tests and it would output a fourth line on console.log specifying that the last test run, which didn't have time to finish, took 2 minutes. This will help us detect when we have serious failures.
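
Roughly, the reporting boils down to something like this (durations are in ms; 'lastUnfinishedMs' is how long the killed run had been going):

function reportResults(testName, results, lastUnfinishedMs) {
  var sorted = results.slice().sort(function (a, b) { return a - b; });
  function percentile(p) {
    var index = Math.min(sorted.length - 1,
                         Math.ceil(sorted.length * p / 100) - 1);
    return sorted[index];
  }
  console.log(testName + ' 90th percentile:  ' + percentile(90) + ' ms');
  console.log(testName + ' 95th percentile:  ' + percentile(95) + ' ms');
  console.log(testName + ' 100th percentile: ' + percentile(100) + ' ms');
  console.log(testName + ' last (unfinished) run: ' + lastUnfinishedMs + ' ms');
}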

Test Set up - As specified here<https://github.com/thaliproject/Thali_CordovaPlugin/blob/story_0/TeamReadMe.md#unit-testing-the-thali-cordova-plugin> one can set up our testing environment by calling jx npm test from ThaliCordovaPlugin. This creates a project called ThaliTest that is configured to run our tests. We need to introduce a new script to our package.json called testPerf. It should call 'test' and then it should create some kind of flag file in www/jxcore in ThaliTest that tells the environment that we are running perfTests.  It should also start up a local jx instance running the coordinator JS file. When ThaliTest is deployed to the phones the phones will see the flag file and know they are in perfTest mode. At that point they can pick up the JSON config file and find out the IP address/Port of the coordinator server and also how many devices they should expect to work with.
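
For the package.json piece I am imagining something along these lines (the helper script names, setUpPerfTest.js and coordinator.js, are just placeholders):

{
  "scripts": {
    "testPerf": "jx npm run test && jx setUpPerfTest.js && jx coordinator.js"
  }
}

where setUpPerfTest.js would drop the flag file into www/jxcore in ThaliTest and coordinator.js is the coordinator server described above.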

Jukka, please let me know if this all makes sense to you. If so we absolutely must run test #1 before we integrate BLE and we really need to run tests #2 and #3 before we release story 0.0. What's particularly nice about these tests is that because they are written in Node.js we can use them, as is, to run tests against iOS as well!

So, would you be willing to take this on as part of the BLE work?

       Thanks,

            Yaron