APK Structure#

This section is a short primer on the Android Package file (APK). It is assumed that the reader is already familiar with the structure of an APK. Good introductory material can be found on the official Android developer portal.

An Android app is a zip file containing application code, data, and metadata.

  • Code: Dalvik (bytecode), Native (*.so libs)
  • Data: resources (structured), assets (unstructured)
  • Metadata: manifest (what), certificates (who)
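The code/data/metadata split above can be sketched with the Python standard library. This is a minimal, path-based heuristic: the category labels and the function names `classify_entry` and `summarize_apk` are assumptions made for illustration, and obfuscated apps may store files in non-standard locations.

```python
import zipfile

def classify_entry(name):
    """Map an APK entry name to a coarse category, based on its path only."""
    if name == "AndroidManifest.xml":
        return "metadata/manifest"
    if name.startswith("META-INF/"):
        return "metadata/certificates"
    if name.endswith(".dex"):
        return "code/dalvik"
    if name.endswith(".so"):
        return "code/native"
    if name == "resources.arsc" or name.startswith("res/"):
        return "data/resources"
    if name.startswith("assets/"):
        return "data/assets"
    return "other"

def summarize_apk(file_or_path):
    """Group the entry names of an APK (a zip archive) by category."""
    groups = {}
    with zipfile.ZipFile(file_or_path) as zf:
        for name in zf.namelist():
            groups.setdefault(classify_entry(name), []).append(name)
    return groups
```

Calling `summarize_apk("app.apk")` (a hypothetical file name) returns a dictionary mapping each category to the list of matching entries.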

When JEB processes an APK, the resulting structure in the Project tree differs from a raw ZIP tree view. The differences range from slight for regular apps to significant for obfuscated or complex apps.

The picture below shows a side-by-side comparison of processing an app as a ZIP file vs processing it as an APK:

  • Manifest: encoded VS decoded
  • Certificates: v1 (visible, in files under META-INF/), v2/v3 (in ZIP) VS parsed certificates
  • Bytecode: split over classes.dex, classes2.dex, ..., classesN.dex VS virtual merged DEX unit
  • Resources: encoded and scattered in resources.arsc, res/, elsewhere (anywhere) VS decoded and reorganized resources
  • Native libs: 1-to-1
  • Assets: 1-to-1

Certificates#

As of Android 11, four types of signatures are in place to sign APKs, versions 1 through 4.

Version 1#

Version 1 is the legacy scheme supported by all versions of Android:

  • standard Jar signing (Oracle)
  • signing data goes in META-INF/
  • each individual file in the archive is signed
    • MANIFEST.MF: list of hashes of all files
    • xxx.SF: hash of hash entries in MANIFEST.MF
    • xxx.{RSA,DSA,...}: signature of xxx.SF + signer certificate (= what JEB displays)
    • note that xxx='CERT', usually
  • The apk/zip itself is not signed: this scheme is both inefficient and incomplete when the goal is to verify the APK as a whole
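The per-file hashing chain of the v1 scheme can be sketched as follows. This is a simplified illustration, assuming SHA-256 digests and ignoring the 72-byte line wrapping of the JAR manifest format; the function names `manifest_section` and `sf_section` are hypothetical, and the final signature step (the xxx.RSA file) is not covered.

```python
import base64
import hashlib

def _b64_sha256(data):
    """Base64-encoded SHA-256 digest, as used in JAR manifest attributes."""
    return base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")

def manifest_section(name, data):
    """Build the MANIFEST.MF section for one archive entry (hash of the file)."""
    return "Name: %s\r\nSHA-256-Digest: %s\r\n\r\n" % (name, _b64_sha256(data))

def sf_section(name, manifest_sec):
    """Build the matching xxx.SF section (hash of the MANIFEST.MF section)."""
    digest = _b64_sha256(manifest_sec.encode("utf-8"))
    return "Name: %s\r\nSHA-256-Digest: %s\r\n\r\n" % (name, digest)
```

Because each entry is hashed separately, anything outside the listed entries (such as the zip structure itself) is not covered, which is the limitation noted above.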

Versions 2/3#

Versions 2 and 3 are specific to Android:

  • What is signed is the APK as a whole
  • Uses a twist in zip format specifications
  • The global signing block is inserted just before the zip Central Directory (and can be located by looking for a magic number)
  • V3 = V2 + support for key rotation
  • What is displayed in the Certificate fragment is the signer's certificate, just like V1's
  • Review the reference documentation for additional details
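Locating the signing block can be sketched as below. The sketch assumes the publicly documented layout (a 64-bit size, length-prefixed ID-value pairs, the size repeated, then the 16-byte magic "APK Sig Block 42", all placed immediately before the Central Directory), and it finds the End of Central Directory record with a naive backward search. The function name `find_signing_block` is hypothetical, and no signature is verified.

```python
import struct

EOCD_MAGIC = b"PK\x05\x06"
SIG_BLOCK_MAGIC = b"APK Sig Block 42"
SCHEME_IDS = {0x7109871A: "v2", 0xF05368C0: "v3"}  # pair IDs of the v2 and v3 blocks

def find_signing_block(apk):
    """Return (offset, {pair_id: value}) for the signing block in `apk` bytes, or None."""
    eocd = apk.rfind(EOCD_MAGIC)  # naive: assumes the zip comment does not contain the magic
    if eocd < 0 or eocd + 22 > len(apk):
        raise ValueError("not a zip archive")
    cd_offset = struct.unpack_from("<I", apk, eocd + 16)[0]
    if cd_offset < 32 or apk[cd_offset - 16:cd_offset] != SIG_BLOCK_MAGIC:
        return None
    # The size field excludes the leading copy of itself (8 bytes).
    size = struct.unpack_from("<Q", apk, cd_offset - 24)[0]
    start = cd_offset - size - 8
    if start < 0 or struct.unpack_from("<Q", apk, start)[0] != size:
        return None
    pairs = {}
    pos, end = start + 8, cd_offset - 24
    while pos + 12 <= end:
        pair_len, pair_id = struct.unpack_from("<QI", apk, pos)
        pairs[pair_id] = apk[pos + 12:pos + 8 + pair_len]
        pos += 8 + pair_len
    return start, pairs
```

The scheme versions present can then be listed with `[SCHEME_IDS[i] for i in pairs if i in SCHEME_IDS]`.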

Version 4#

The APK signature scheme version 4 was introduced with Android 11 (R) to ease development of larger applications. The APK is signed incrementally via a Merkle tree. The signing data is stored separately in an <APKNAME>.idsig file.
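To illustrate the idea of a Merkle tree, here is a generic sketch that hashes fixed-size blocks and then combines digests pairwise until a single root remains. It is not the exact tree layout used by idsig files: the 4096-byte block size, the use of SHA-256, and the duplication of the last digest on odd levels are all assumptions made for this example.

```python
import hashlib

def merkle_root(data, block_size=4096):
    """Hash fixed-size blocks, then hash pairs of digests until one root remains."""
    level = [hashlib.sha256(data[i:i + block_size]).digest()
             for i in range(0, len(data), block_size)]
    if not level:
        level = [hashlib.sha256(b"").digest()]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # convention chosen here: duplicate the last digest
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]
```

The benefit of such a structure is that a single block can be checked against the root using only the digests along its path, without rehashing the whole file.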

Note

Version 4 signatures do not seem to be designed for release purposes. At the moment, JEB does not parse idsig files.

JEB parses v1/v2/v3 signing data. The certificate is displayed as a tree in the UI client:

API

To retrieve this data programmatically, refer to IApkUnit and its methods getSignatureSchemeVersionFlags and getSignatureSchemeV{2,3}Block.

Manifest#

AndroidManifest.xml defines the Android application to whoever interacts with it, from building, to deployment, to execution.

Important parts of the Manifest:

  • Package name (fully qualified Java name)
  • Requirements to run the app (API level, hardware configs)
  • Permissions required by the app (not all may be granted by the system at t0)
  • Components must be declared in the Manifest, except for Broadcast Receivers, which can also be registered dynamically
    • Activities (UI elements)
    • Services (background execution)
    • Broadcast Receivers (receive and process events from apps/system)
    • Content Providers (offer data to other apps/system)
  • Whether the app is debuggable on a production device: <application android:debuggable="false|true" ...

For example, the simple manifest below...

  • declares an app named (internal package name) com.xyz.appcheck
  • requiring at least and ideally API 26 (Android O)
  • wants read+write access to storage
  • the App is debuggable
  • it declares one main activity (visible on launcher)
  • as well as one implicit broadcast receiver
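The properties listed above can be read from a decoded manifest with the Python standard library. The manifest text in this sketch is a hypothetical reconstruction that matches the description, not the original sample: the activity and receiver names, the receiver's intent filter, and the function name `summarize_manifest` are assumptions.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"

def android_attr(name):
    """Qualified name of an android: attribute, as ElementTree expects it."""
    return "{%s}%s" % (ANDROID_NS, name)

# Hypothetical manifest matching the description; component names are made up.
SAMPLE = """<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.xyz.appcheck">
  <uses-sdk android:minSdkVersion="26" android:targetSdkVersion="26"/>
  <uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE"/>
  <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE"/>
  <application android:debuggable="true">
    <activity android:name=".MainActivity">
      <intent-filter>
        <action android:name="android.intent.action.MAIN"/>
        <category android:name="android.intent.category.LAUNCHER"/>
      </intent-filter>
    </activity>
    <receiver android:name=".BootReceiver">
      <intent-filter>
        <action android:name="android.intent.action.BOOT_COMPLETED"/>
      </intent-filter>
    </receiver>
  </application>
</manifest>"""

def summarize_manifest(xml_text):
    """Extract package, SDK levels, permissions, debuggable flag and components."""
    root = ET.fromstring(xml_text)
    sdk = root.find("uses-sdk")
    app = root.find("application")
    components = {}
    for tag in ("activity", "service", "receiver", "provider"):
        found = app.findall(tag) if app is not None else []
        components[tag] = [e.get(android_attr("name")) for e in found]
    return {
        "package": root.get("package"),
        "min_sdk": sdk.get(android_attr("minSdkVersion")) if sdk is not None else None,
        "target_sdk": sdk.get(android_attr("targetSdkVersion")) if sdk is not None else None,
        "permissions": [e.get(android_attr("name")) for e in root.findall("uses-permission")],
        "debuggable": app is not None and app.get(android_attr("debuggable")) == "true",
        "components": components,
    }
```

This only works on a manifest that has already been decoded to text; the binary-encoded AndroidManifest.xml inside an APK must be decoded first.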

Note

Manifests can be very complex and lengthy. For example, the primary Facebook app (com.facebook.katana) manifest is well over 2000 lines, mostly Activity descriptions.

About Permissions#

Permissions provide an indirect insight into what "functions" an app needs to perform. They are granted by the user at install and/or runtime:

  • Before API 23, permissions were all granted at install time. A pop-up would display which dangerous permission groups are being requested.
  • With API levels 23+, permissions are granular, and dangerous permissions are granted at run-time
    • unless the Manifest declares a targetSdkVersion below 23!
    • the user will be shown a system pop-up
    • permissions can also be revoked in settings

All permissions MUST be declared in the Manifest, whether or not the code requiring them is actually used, and whether they are granted explicitly or implicitly. In other words, an app cannot programmatically request a permission that was not declared in the Manifest in the first place.
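The grant rules above can be condensed into a small decision function. This is a simplified model of the rules as stated in this section, not of the full Android permission system; the function names `grant_time` and `can_request` are hypothetical.

```python
def grant_time(dangerous, device_api, target_sdk):
    """Return when a declared permission is granted: 'install' or 'runtime'."""
    # Run-time grants apply only to dangerous permissions, on API 23+ devices,
    # for apps that also target API 23 or above.
    if dangerous and device_api >= 23 and target_sdk >= 23:
        return "runtime"
    return "install"

def can_request(permission, declared):
    """An app can only request a permission that its Manifest declares."""
    return permission in declared
```

For example, `grant_time(True, 30, 22)` yields "install", which reflects the targetSdk exception mentioned in the list above.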

Structured Resources#

Structured resources of an app consist of XML files (e.g., app layouts, strings, etc.), image files, icons, etc.

  • XML resources are encoded using a binary format called arsc. The manifest, an XML resource, is encoded as well.
  • Common resources’ information goes into the app resources.arsc file
    • Resources reference resources.arsc items by id
    • They can also reference Android Framework and other vendor-installed framework resources by id (refer to the section 'Third-party Frameworks')
  • JEB always ships with the latest official Android Framework

Note

For additional information on Resources:

  • high-level information can be found in the official doc
  • lower-level details of the arsc format can be found by going through the main implementation of the encoder and decoder, on the AOSP's platform/frameworks/base repository. The newest ResourceTypes.h is located here.

Oddities and Obfuscation#

Resources on Android can be mangled in several ways. JEB unmangles them to the best of its ability. Below, we briefly describe two commonly found obfuscation techniques.

Name removal#

Resource items are normally identified by a name as well as an id. Several application protectors remove resource names from the compiled resources.arsc file when they reference well-known framework resources.

E.g., the manifest below had resource names removed. Note that most XML attribute names are missing.

<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android" :="1.1" android:versionCode="2" platformBuildVersionName="6.0-5078647" platformBuildVersionCode="23" package="com.virginoff.player">
    <uses-sdk :="7" :="23"/>
    <uses-permission :="android.permission.ACCESS_NETWORK_STATE"/>
    <uses-permission :="android.permission.SEND_SMS"/>
    <uses-permission :="android.permission.INTERNET"/>
    <uses-permission :="android.permission.WRITE_EXTERNAL_STORAGE"/>
    <uses-permission :="android.permission.WAKE_LOCK"/>
    <application :="@style/Theme.NoTitleBar.Fullscreen" :="Anal Sex Video" :="@drawable/ic_launcher" :=".Application" :="false">
        <activity :=".activity.WrapperActivity" :="true">
            <intent-filter>
                <action android:name="android.intent.action.MAIN"/>
                <category android:name="android.intent.category.LAUNCHER"/>
            </intent-filter>
        </activity>
        <activity :="o.ϟ" :="0" :="a0"/>
        <activity :="o.ȭ" :="0" :="a0"/>
        <service :="o.Ƃ"/>
    </application>
</manifest>

Dump the manifest using aapt to see the actual ids:

$ aapt dump xmltree 1.apk AndroidManifest.xml
N: android=http://schemas.android.com/apk/res/android
  E: manifest (line=0)
    A: :(0x0101021c)="1.1" (Raw: "1.1")
    A: android:versionCode(0x0101021b)=(type 0x10)0x2
    A: platformBuildVersionName="6.0-5078647" (Raw: "6.0-5078647")
    A: platformBuildVersionCode=(type 0x10)0x17
    A: package="com.virginoff.player" (Raw: "com.virginoff.player")
    E: uses-sdk (line=0)
      A: :(0x0101020c)=(type 0x10)0x7
      A: :(0x01010270)=(type 0x10)0x17
    E: uses-permission (line=0)
      A: :(0x01010003)="android.permission.ACCESS_NETWORK_STATE" (Raw: "android.permission.ACCESS_NETWORK_STATE")
...

Above, we can see that the uses-permission tag, for example, specifies the use of an attribute whose id is 0x01010003.

Attributes of an Android Manifest are well-known resources and stored as such in the Android framework. You can use aapt on the Android framework file to see them:

$ aapt dump --values resources ~/.jeb-android-frameworks/1.apk
Package Groups (1)
Package Group 0 id=0x01 packageCount=1 name=android
  Package 0 id=0x01 name=android
    type 0 configCount=1 entryCount=1543
      spec resource 0x01010000 android:attr/theme: flags=0x40000000
      spec resource 0x01010001 android:attr/label: flags=0x40000000
      spec resource 0x01010002 android:attr/icon: flags=0x40000000
 ---> spec resource 0x01010003 android:attr/name: flags=0x40000000
      spec resource 0x01010004 android:attr/manageSpaceActivity: flags=0x40000000
      spec resource 0x01010005 android:attr/allowClearUserData: flags=0x40000000
...

So, that tag could be restored to:

<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE"/>

Note

The Android framework contains all base system resources for a given version of Android. It is located at /system/framework/framework-res.apk (resources only) on a device, or at platforms/<APILEVEL>/android.jar in the Android SDK. JEB also drops the latest stable framework into your home folder, at .jeb-android-frameworks/1.apk.

The above process is automated by JEB to restore XML files to a human-readable state.
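The id-to-name lookup can be sketched as a text transformation on `aapt dump xmltree` output. This is an illustration only, not JEB's implementation: the table holds just the six framework attributes visible in the listing above, and the function name `restore_attr_names` is hypothetical.

```python
import re

# Subset of framework attribute ids, taken from the aapt listing above.
FRAMEWORK_ATTRS = {
    0x01010000: "theme",
    0x01010001: "label",
    0x01010002: "icon",
    0x01010003: "name",
    0x01010004: "manageSpaceActivity",
    0x01010005: "allowClearUserData",
}

_ATTR = re.compile(r"A: (?P<prefix>\w*):(?P<name>\w*)\((?P<id>0x[0-9a-fA-F]{8})\)")

def restore_attr_names(xmltree_line, table=FRAMEWORK_ATTRS):
    """Fill in a stripped attribute name in one line of `aapt dump xmltree` output."""
    def fix(match):
        name = match.group("name") or table.get(int(match.group("id"), 16), "")
        prefix = match.group("prefix") or ("android" if name else "")
        return "A: %s:%s(%s)" % (prefix, name, match.group("id"))
    return _ATTR.sub(fix, xmltree_line)
```

Lines whose attribute name is intact, or whose id is missing from the table, are returned unchanged.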

Flattened hierarchy#

Although most structured resources (with the notable exception of the Manifest) are typically stored hierarchically under the res folder, they do not have to be. Some application protectors take advantage of this fact to flatten the resources tree, for example by storing the resource files in the APK's root folder.

E.g., in the file below (a protected online banking app), most resource files were renamed to mangled names and stored alongside the Manifest; the res/ folder is present and contains only a handful of resources.

JEB restores both the hierarchy and names of those resource files.

Decoding problems#

Other oddities exist. They are found in apps that stretch the arsc format specification to its limits.

They can be used, intentionally or not, to thwart and crash various open-source tools. We won't detail them here, but additional information can be found on our blog as well as on Apktool's GitHub issue tracker, a prime source of weird parsing cases.

As an example, here is aapt2 (version around spring 2019) failing on version 153 of the Facebook app:

$ aapt2 dump Facebook_v153.0.0.54.88.apk
error: trying to add resource 'com.facebook.katana:id/(name removed)' with ID 0x7f090001 but resource already has ID 0x7f090000.

Assets#

Assets are unstructured resources. They can be of any type and stored anywhere in the APK archive. However, the assets/ directory is standard, and used by the Android AssetManager object.

Assets stored in the Resources folder res/raw are stored as-is (in particular, XML files are not encoded), and yet, are accessible in code by id, using the R class, just like any other standard resource.

The asset file below, edd.bin, holds encrypted data.

Native Code#

Android applications often contain native code, compiled as ELF shared libraries (.so files). They can be located anywhere in the app. SO files can be loaded from bytecode via System.loadLibrary(simpleName) and System.load(path).

A common location for SO files is the app's lib/ folder. Libraries stored in this folder and adhering to the JNI naming convention allow the Android system to unpack the appropriate SO files to the device folder /data/data/<app>/lib. This makes it easier for high-level code to load them: there is no need to implement the logic of figuring out which underlying platform the device is running on.

  • Location: [APK]/lib/<abi>/lib<name>.so
  • Example: on a Pixel phone with an Arm64 (aarch64) CPU, the high-level request System.loadLibrary("native-lib") resolves to lib/arm64-v8a/libnative-lib.so
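The naming convention above can be sketched as a simple lookup. This is a simplification: it tries the device ABIs in preference order for a single library, whereas the system's actual ABI selection at install time is more involved. The function names `native_lib_entry` and `resolve_library` are hypothetical.

```python
def native_lib_entry(simple_name, abi):
    """APK entry path for a library name, following lib/<abi>/lib<name>.so."""
    return "lib/%s/lib%s.so" % (abi, simple_name)

def resolve_library(simple_name, entry_names, supported_abis):
    """Return the first matching entry for the device ABIs, in preference order."""
    names = set(entry_names)
    for abi in supported_abis:
        candidate = native_lib_entry(simple_name, abi)
        if candidate in names:
            return candidate
    return None
```

For an analyst, the same mapping works in reverse: given a `System.loadLibrary` call in the bytecode, it tells which files under lib/ to examine.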

Info

For more information on native code analysis, debugging and decompilation, refer to the manual pages related to native code.

Bytecode#

Refer to the DEX sub-section below.