Note: This is mostly put together from a Twitter thread I wrote last night, for easier reading and with a bit more code for context.
TraceTogether is a mobile app developed by Singapore’s Government Technology Agency (GovTech) that performs distributed contact tracing for the purposes of informing people if they have come into close contact with somebody that has been infected with COVID-19. It’s been all over the news so you can Google about it for more info.
I took a look at TraceTogether over the weekend, to see what I could find. This is still very preliminary, but since there’s a lot of chatter about it, I decided to write something up on whatever I have so far. I’ve not had a lot of time to work on it though, so some parts (or maybe all of it) might be wrong. Let me know!
First Look
I only did a static analysis of the code, because that’s what I’m better at, and because I didn’t want to spend the time setting up a phone to debug on. I used apktool and JEB to open up the sg.gov.tech.bluetrace APK and have a look. The AndroidManifest looked normal, nothing weird there. In JEB, there were several classes, mostly UI related. The code was written in Kotlin, which means it looks messier when decompiled to Java, as JEB does. There was a package gov.tech.bluetrace.streetpass, which had some database code, and is probably where the records are kept. (There’s a Medium post by Frank L that goes into more detail about the database.)
What I did notice immediately, though, was a huge amount of obfuscated classes. I’ll talk more about this later.
Location Privacy
I decided to first verify something that a lot of people have been wondering about: does the app collect location info? GovTech has been assuring everybody it doesn’t, but we best check the code to confirm this.
I started by grepping for any part of the code that uses LocationManager, and found relatively few hits.
$ grep -RF "LocationManager" *
o/COn.smali:.field private final ı:Landroid/location/LocationManager;
o/COn.smali:.method constructor <init>(Landroid/content/Context;Landroid/location/LocationManager;)V
o/COn.smali: iput-object p2, p0, Lo/COn;->ı:Landroid/location/LocationManager;
o/COn.smali: iget-object v0, p0, Lo/COn;->ı:Landroid/location/LocationManager;
o/COn.smali: invoke-virtual {v0, p1}, Landroid/location/LocationManager;->isProviderEnabled(Ljava/lang/String;)Z
o/COn.smali: iget-object v0, p0, Lo/COn;->ı:Landroid/location/LocationManager;
o/COn.smali: invoke-virtual {v0, p1}, Landroid/location/LocationManager;->getLastKnownLocation(Ljava/lang/String;)Landroid/location/Location;
o/ȷ.smali: check-cast v3, Landroid/location/LocationManager;
o/ȷ.smali: invoke-direct {v2, v1, v3}, Lo/COn;-><init>(Landroid/content/Context;Landroid/location/LocationManager;)V
o/x.smali: check-cast p0, Landroid/location/LocationManager;
o/x.smali: invoke-virtual {p0, v0}, Landroid/location/LocationManager;->isProviderEnabled(Ljava/lang/String;)Z
o/x.smali: invoke-virtual {p0, v5}, Landroid/location/LocationManager;->isProviderEnabled(Ljava/lang/String;)Z
o/x.smali: invoke-virtual {p0, v2}, Landroid/location/LocationManager;->getProviders(Z)Ljava/util/List;
o/x.smali: invoke-virtual {p0, v0}, Landroid/location/LocationManager;->getLastKnownLocation(Ljava/lang/String;)Landroid/location/Location;
The function getLastKnownLocation() is called from two places in the code (o/COn and o/x in the grep output above). This function returns the last known location of the device. I checked to see why the code was calling this function, and found both cases were (more or less) benign.
In the first case, the location information was used to determine latitude and longitude, in order to calculate sunrise and sunset timings, so the app could turn on night mode at night.
Location v3 = Ӏ.ι(v1.Ι, "android.permission.ACCESS_COARSE_LOCATION") == 0 ? v1.get_location("network") : null;
if (Ӏ.ι(v1.Ι, "android.permission.ACCESS_FINE_LOCATION") == 0) {
    v4 = v1.get_location("gps");
}
if (v4 != null && v3 != null) {
    if(v4.getTime() > v3.getTime()) {
        v3 = v4;
    }
} else if(v4 != null) {
    v3 = v4;
}
if(v3 != null) {
    o.COn.ǃ v1_1 = v1.ǃ;
    long v4_1 = System.currentTimeMillis();
    if(Ӏ.ı == null) {
        Ӏ.ı = new Ӏ();
    }
    Ӏ v6 = Ӏ.ı;
    // The function call below calculates twilight.
    v6.ǃ(v4_1 - 86400000L, v3.getLatitude(), v3.getLongitude());
    v6.ǃ(v4_1, v3.getLatitude(), v3.getLongitude());
    if(v6.Ι == 1) {
        v7 = true;
    }
    // ...
    v1_1.ι = v7;
    v1_1.Ι = v16_1;
    return v2.ι ? 2 : 1;
}
int v1_2 = Calendar.getInstance().get(11);
if(v1_2 < 6 || v1_2 >= 22) {
    v7 = true;
}
return v7 ? 2 : 1;
The function being called above (v6.ǃ) appears to be calculateTwilight() from AppCompat’s TwilightCalculator. You can see that in the case where location info is unavailable, v7 is set to true if the hour is before 6 or from 22 onwards, i.e. nighttime.
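The fallback branch is small enough to restate as a minimal Python sketch. The function and constant names here are made up for illustration; the return values 2 (night) and 1 (day) and the 6/22 cut-offs are taken from the decompiled code above.

```python
from datetime import datetime
from typing import Optional

MODE_DAY = 1    # value returned by the decompiled code when v7 is false
MODE_NIGHT = 2  # value returned by the decompiled code when v7 is true


def night_mode_from_hour(hour: Optional[int] = None) -> int:
    """Clock-only fallback, used when no last known location is available."""
    if hour is None:
        # Counterpart of Calendar.getInstance().get(11), i.e. HOUR_OF_DAY (0-23).
        hour = datetime.now().hour
    return MODE_NIGHT if hour < 6 or hour >= 22 else MODE_DAY
```

So with no location at all, the app would simply treat 22:00 to 05:59 as night.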
In the second case, location info was gathered for analytics purposes. They’re using something called Snowplow, which I’m not familiar with.
Location v14 = x.get_location(this.І); // this calls the function that uses getLastKnownLocation().
if(v14 == null) {
v14_3 = null;
} else {
HashMap v1 = new HashMap();
Double v2 = (double)v14.getLatitude();
if(v2 != null) {
v1.put("latitude", v2);
}
Double v2_1 = (double)v14.getLongitude();
if(v2_1 != null) {
v1.put("longitude", v2_1);
}
Double v2_2 = (double)v14.getAltitude();
if(v2_2 != null) {
v1.put("altitude", v2_2);
}
Float v2_3 = (float)v14.getAccuracy();
if(v2_3 != null) {
v1.put("latitudeLongitudeAccuracy", v2_3);
}
Float v2_4 = (float)v14.getSpeed();
if(v2_4 != null) {
v1.put("speed", v2_4);
}
Float v14_1 = (float)v14.getBearing();
if(v14_1 != null) {
v1.put("bearing", v14_1);
}
Long v14_2 = (long)System.currentTimeMillis();
if(v14_2 != null) {
v1.put("timestamp", v14_2);
}
v14_3 = x.ǃ(v1, new String[]{"latitude", "longitude"}) ? new m("iglu:com.snowplowanalytics.snowplow/geolocation_context/jsonschema/1-1-0", v1) : null;
As you can see from the code above, the location info is packaged into a HashMap, which is stored somewhere. I’ve not verified when exactly the location info is gathered, and if or when it is sent out.
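To make the shape of that payload concrete, here is a rough Python sketch of what the decompiled code appears to build. The helper name and the dict layout with "schema" and "data" keys are assumptions for illustration; only the field names and the schema URI come from the code above. It also assumes that the x.ǃ(...) call is a check that latitude and longitude are both present, which I haven’t confirmed.

```python
import time
from typing import Any, Dict, Optional

# Schema URI as it appears in the decompiled code.
GEO_SCHEMA = "iglu:com.snowplowanalytics.snowplow/geolocation_context/jsonschema/1-1-0"

# Field names copied from the v1.put(...) calls.
FIELDS = ("latitude", "longitude", "altitude",
          "latitudeLongitudeAccuracy", "speed", "bearing")


def build_geolocation_context(location: Optional[Dict[str, float]]) -> Optional[Dict[str, Any]]:
    """Hypothetical re-creation of the map built from getLastKnownLocation()."""
    if location is None:
        return None
    # Copy only the fields that are actually present.
    data: Dict[str, Any] = {k: location[k] for k in FIELDS if location.get(k) is not None}
    data["timestamp"] = int(time.time() * 1000)  # like System.currentTimeMillis()
    # Assumed meaning of x.ǃ(v1, {"latitude", "longitude"}): both keys are required.
    if "latitude" not in data or "longitude" not in data:
        return None
    return {"schema": GEO_SCHEMA, "data": data}
```

Nothing in this sketch says where the resulting object goes next; that part is still unverified.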
Now, you could consider this a problem, but I think having analytics like this is standard issue in many apps nowadays? It’s also not really something I would consider explicitly malicious. I’ve also heard from people who’ve used Frida to hook getLastKnownLocation() that it doesn’t seem to be called that often.
In summary: GovTech isn't lying. TraceTogether doesn't save your location, or send it to a server. But it has an analytics component that might, in particular scenarios (e.g. when the app crashes).
[Update, 01 Apr 2020: Kevin Chu took a deeper dive into the analytics being used in TraceTogether and convinced the developers to remove it in future versions.]
Obfuscation Galore
After sorting that out, I moved on to look at the rest of the code. As mentioned above, a lot of the code is heavily obfuscated, and the obfuscation was more than I’m used to seeing in an Android app. Besides renaming classes and packages and encrypting strings, it also messes up the call graph by using a function that dynamically dispatches method calls based on some string values.
try {
    mj.if.logD("StartOnBootReceiver", "Attempting to start service");
    v4 = ((Class)xxx.obfcall(4, 11, '爒')).getField("ι").get(null);
}
catch(Throwable v10) {
    goto label_48;
}
try {
    ((Class)xxx.obfcall(4, 11, '爒')).getMethod("ı", Context.class, Long.TYPE).invoke(v4, arg10, ((long)500L));
    return;
}
catch(Throwable v10_1) {
}
The function, which I’ve renamed obfcall above, is large and fairly complex. It takes in 3 arguments: 2 integers and a character. Here’s a condensed snippet of it.
byte v8_4 = (byte)(((v8_3 | -1) << 1) - (v8_3 ^ -1));
try {
v8_5 = xxx.$$c(v8_4, 0x204, 842);
v10_3 = xxx.memb4;
}
// ...
try {
label_75:
v8_6 = Class.forName(v8_5, true, ((ClassLoader)v10_3));
v10_4 = xxx.memb9[1];
}
// ...
byte v10_5 = (byte)(v10_4 + 1);
xxx.memb11 = (((xxx.mem6 | 71) << 1) - (xxx.mem6 ^ 71)) % 0x80;
try {
v10_6 = xxx.$$c(v10_5, 617, 0x20);
v3 = new Class[]{Integer.TYPE, Integer.TYPE, Character.TYPE};
}
// ...
try {
return v8_6.getMethod(v10_6, v3).invoke(v1, v4);
label_113:
Object v8_7 = v8_6.getMethod(v10_6, v3).invoke(v1, v4);
super.hashCode();
return v8_7;
}
The function first performs some convoluted arithmetic on its 3 arguments before deriving a string v8_5. This string is used as the name of a class to be loaded using Class.forName(), which results in the class object v8_6. Note that since class names have been obfuscated, v8_5 is probably just a few characters long. The method name is derived similarly into v10_6, and finally the method is invoked on the class.
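Some of that arithmetic is simpler than it looks. Expressions of the form ((a | c) << 1) - (a ^ c) in the snippet above are just a + c written with bitwise operators, because a | c equals (a ^ c) + (a & c), and a + c equals (a ^ c) + 2 * (a & c). Here is a quick Python check of that identity (the function name is made up for illustration):

```python
def obfuscated_add(a: int, c: int) -> int:
    """The pattern seen in obfcall: ((a | c) << 1) - (a ^ c)."""
    return ((a | c) << 1) - (a ^ c)


# Compare against plain addition for the two constants used in the snippet.
for a in range(-512, 512):
    assert obfuscated_add(a, -1) == a - 1    # the v8_4 line
    assert obfuscated_add(a, 71) == a + 71   # the xxx.memb11 line
```

So the v8_4 line reduces to (byte)(v8_3 - 1), and the xxx.memb11 assignment to (xxx.mem6 + 71) % 0x80. The identity also holds under Java’s wrapping 32-bit arithmetic, so the simplification carries over to the decompiled code.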
From some cursory research, this obfuscator is probably DexGuard, a commercial version of ProGuard. Obfuscation is security through obscurity; it’s definitely possible to reconstruct the call graph and determine the actual function being called for each invocation of obfcall(). In fact most calls can be trivially mapped with a bit of dynamic analysis. But, it’s hella irritating, and will take a while. I’ve decided not to spend time reversing that, especially since it seems the source will be available soon.
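If someone did want to map the calls, a sensible first step would be to collect every obfcall() call site and group them by argument triple, since each unique triple only needs to be resolved once. Below is a rough Python sketch of that bookkeeping. It is not the dynamic analysis itself, and it assumes the decompiled text uses my renamed obfcall identifier with decimal integer arguments, as in the boot receiver snippet above.

```python
import re
from collections import Counter

# Matches calls like: obfcall(4, 11, '爒')
CALL_RE = re.compile(r"obfcall\((\d+),\s*(\d+),\s*'(.)'\)")


def group_call_sites(decompiled: str) -> Counter:
    """Count how often each (int, int, char) argument triple appears."""
    return Counter((int(a), int(b), ch) for a, b, ch in CALL_RE.findall(decompiled))


# The two calls from the boot receiver snippet.
SAMPLE = """
v4 = ((Class)xxx.obfcall(4, 11, '爒')).getField("ι").get(null);
((Class)xxx.obfcall(4, 11, '爒')).getMethod("ı", Context.class, Long.TYPE).invoke(v4, arg10, ((long)500L));
"""

sites = group_call_sites(SAMPLE)
for triple, count in sites.items():
    print(triple, count)
```

Both calls in the boot receiver use the same triple, so resolving it once (for example by hooking obfcall at runtime) would cover both.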
Unfortunately, a lot of the core functionality of TraceTogether seems to be in the obfuscated region, and blocked by the aforementioned obfuscation techniques. For example, the function called on boot is protected by the obfcall() function. This makes static analysis more time-consuming, and incomplete without deobfuscation.
Native library obfuscation
TraceTogether also comes with a native library, libb.so. I had a look at it in IDA, and it turns out it uses obfuscation as well, mostly on strings. For example, the string “/proc/%d/status” is obfuscated using a kind of substitution cipher based on the previous character.
// obfs_proc_d_status =
// "f\x9f\xe2\xe1\xd2\x92T\x89\x93\xa2\xe7\xd5\xd5\xe9\xe8s"
if ( !(byte_222F7 & 1) ) {
v1 = 0LL;
v2 = 0;
v3 = 55;
while ( 1 ) {
while ( 1 ) {
v4 = v2 & 3;
if ( v4 != 1 )
break;
byte_222F7 = 1;
v2 = 2;
}
if ( v4 )
break;
v3 = obfs_proc_d_status[v1] - v3;
obfs_proc_d_status[v1++] = v3;
v2 = v1 == 16;
}
if ( v4 != 2 ) {
while ( 1 )
;
}
}
sprintf(v20, obfs_proc_d_status, pid);
I deobfuscated several of the strings, and they were mostly directory paths and some other kinds of constants. Mostly innocuous, but I’ve not deobfuscated every string yet, so I can’t be sure there’s nothing unusual in there.
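The decoding loop in the IDA output boils down to “each decoded byte is the stored byte minus the previous decoded byte, starting from 55”. Here is a minimal Python sketch of that, using the obfuscated bytes from the comment above. The function name is made up, and the seed of 55 is only what this one string uses; other strings may well use different seeds.

```python
def deobfuscate(blob: bytes, seed: int = 55) -> bytes:
    """Undo the chained-subtraction encoding seen in libb.so."""
    out = bytearray()
    prev = seed
    for b in blob:
        # Same as `v3 = obfs[v1] - v3` in the decompiled loop, kept to one byte.
        prev = (b - prev) & 0xFF
        out.append(prev)
    return bytes(out)


# Bytes copied from the obfs_proc_d_status comment (16 bytes, incl. terminator).
OBFS_PROC_D_STATUS = b"f\x9f\xe2\xe1\xd2\x92T\x89\x93\xa2\xe7\xd5\xd5\xe9\xe8s"

decoded = deobfuscate(OBFS_PROC_D_STATUS)
print(decoded.rstrip(b"\x00").decode("ascii"))  # prints /proc/%d/status
```

The last stored byte decodes to the NUL terminator, which is why the loop in the native code runs for all 16 bytes.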
Why is this obfuscated?
One of my original goals in taking a look at #TraceTogether was to get some assurance that it wasn’t doing anything odd. There was an Iranian coronavirus app released some time ago that appeared to be spyware. Could TraceTogether be spyware too? Personally, I doubt it, but I’m not able to definitively answer that question without deobfuscating the code.
Why is there so much obfuscation in TraceTogether? Is there something it is trying to hide? What exactly is happening inside those functions, and in that native library? To be honest, if I were tasked to do an audit on an app that looked this way, at this point I would tell the customer to hold off on installing it, till I managed to reverse it more thoroughly.
However, present circumstances are very different. If a regular Singaporean asked me if they should install TraceTogether, I think I’d still say yes. Why? Well, mostly faith that the people who put this together are not lying, or being malicious. I also think they’d be kind of dumb to subject an app to this level of scrutiny if there was something questionable within it.
Furthermore, if you’re a regular Singaporean: 1. the government is probably not so interested in you that they’d want to backdoor your phone, and 2. you probably have Parking.sg or SingPass mobile installed, in which case you’re already pwned, it’s not going to make a difference. And 3. do you know how many CCTVs there are on this island? I don’t think they need a mobile app to know where you are.
But if you have legit reason to worry about the government monitoring your phone, then yeah I wouldn’t laugh at you for playing it safe for now. (Note: posting copious amounts of anti-PAP tirades on EDMW/reddit doesn’t count as a legit reason.)
Otherwise, I think the benefit of many people using TraceTogether outweighs the possible risk using it might hold. And now that it is being open sourced, and third party clients are encouraged, there’s even less reason to be suspicious.
All that being said, I do feel that the use of obfuscation in TraceTogether is unnecessary and unhelpful. Especially if you’re going to open source it; things like DexGuard are meant to protect intellectual property, so why use it for code you’re going to release? I took a look at Parking.sg, and it doesn’t appear to obfuscate any of the code. Maybe the specific team in GovTech that developed TraceTogether just likes to obfuscate by default?
I would really recommend they release a new version without obfuscation, even after the source is available. That way, independent auditors can more easily verify that the version available on the Play Store matches up with the source and doesn’t contain anything it shouldn’t. That would put all worries to rest.
Next Steps
That’s about it from the “spyware” angle. My next concern, and one that I feel is more important, is how secure it is. The app implements a BLE server that listens for requests from other phones, and parses them. What kind of data could a malicious phone send to a victim phone? What code is responsible for doing the parsing? Any chance of SweynTooth-style attacks compromising devices? This is what I’m focusing on now. Will update when I find out more. If the source is out by then I’ll probably switch to reading it instead.