# apple_generative_model_safety_decrypted

Decrypted Generative Model safety files for Apple Intelligence, containing the safety filters extracted from the Apple Intelligence models.
## Structure

- `decrypted_overrides/`: Contains decrypted overrides for various models.
  - `com.apple.*/`: Directories named using the Asset Specifier associated with the safety info.
    - `Info.plist`: Contains metadata for the override.
    - `AssetData/`: Contains the decrypted JSON files.
- `get_key_lldb.py`: Script to get the encryption key (see usage info below).
- `decrypt_overrides.py`: Script to decrypt the overrides (see usage info below).
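The layout above can be traversed programmatically. Here is a small sketch that collects each override's `Info.plist` metadata; only the directory structure described above is assumed, and the plist keys are whatever Apple ships in each override:

```python
import plistlib
from pathlib import Path

def list_overrides(root: str) -> dict[str, dict]:
    """Map each asset-specifier directory name to its parsed Info.plist."""
    overrides = {}
    for info in sorted(Path(root).glob("*/Info.plist")):
        with open(info, "rb") as f:
            overrides[info.parent.name] = plistlib.load(f)
    return overrides

if __name__ == "__main__":
    # Print each specifier and the metadata keys it carries.
    for specifier, meta in list_overrides("decrypted_overrides").items():
        print(specifier, sorted(meta))
```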
## Usage

### Python dependencies

`cryptography` is the only dependency required to run the decryption script. You can install it using pip:

```
pip install cryptography
```
### Getting the encryption key

To retrieve the encryption key (generated by `ModelCatalog.Obfuscation.readObfuscatedContents`) for the overrides, you must attach LLDB to `GenerativeExperiencesSafetyInferenceProvider` (`/System/Library/ExtensionKit/Extensions/GenerativeExperiencesSafetyInferenceProvider.appex/Contents/MacOS/GenerativeExperiencesSafetyInferenceProvider`). It is important that this is Xcode's LLDB, not the default macOS one or LLVM's lldb. The recommended way to get LLDB to attach:

- Run:

  ```
  sudo killall GenerativeExperiencesSafetyInferenceProvider; sudo xcrun lldb -w -n GenerativeExperiencesSafetyInferenceProvider /System/Library/ExtensionKit/Extensions/GenerativeExperiencesSafetyInferenceProvider.appex/Contents/MacOS/GenerativeExperiencesSafetyInferenceProvider
  ```

- In the Shortcuts app, create a dummy shortcut that uses the generative model action ("Use Model") and select the On-Device option. Type anything into the text field; the contents don't matter. Then run the shortcut.
- You should see LLDB attach to the newly started instance of `GenerativeExperiencesSafetyInferenceProvider` with a message like this:
  ```
  (lldb) process attach --name "GenerativeExperiencesSafetyInferenceProvider" --waitfor
  Process 53629 stopped
  * thread #1, stop reason = signal SIGSTOP
      frame #0: 0x00000001839f41f8 dyld`dyld4::PrebuiltLoader::dependent(dyld4::RuntimeState const&, unsigned int, mach_o::LinkedDylibAttributes*) const + 116
  dyld`dyld4::PrebuiltLoader::dependent:
  ->  0x1839f41f8 <+116>: add    x0, sp, #0xe
      0x1839f41fc <+120>: mov    x1, x19
      0x1839f4200 <+124>: bl     0x1839e50dc    ; dyld4::Loader::LoaderRef::loader(dyld4::RuntimeState const&) const
      0x1839f4204 <+128>: ldrh   w8, [x20, #0x4]
  Target 0: (GenerativeExperiencesSafetyInferenceProvider) stopped.
  Executable binary set to "/System/Library/ExtensionKit/Extensions/GenerativeExperiencesSafetyInferenceProvider.appex/Contents/MacOS/GenerativeExperiencesSafetyInferenceProvider".
  Architecture set to: arm64e-apple-macosx-.
  ```
- In this repository's root, run the following command in LLDB:

  ```
  command script import get_key_lldb.py
  ```

- Then run `c` to continue the process. LLDB will print the encryption key to the console and save it to `./key.bin`.
### Decrypting the overrides

To decrypt the overrides, run the following command in the root of this repository:

```
python decrypt_overrides.py /System/Library/AssetsV2/com_apple_MobileAsset_UAF_FM_Overrides/purpose_auto \
  -k key.bin \
  -o decrypted_overrides
```

The `decrypted_overrides` directory will be created if it does not exist, and the decrypted overrides will be placed in it. This step is only necessary if the overrides have been updated; this repository already contains a decrypted version of the overrides that is up to date as of June 28, 2025.
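The actual decryption logic lives in `decrypt_overrides.py` and is not reproduced here. As an illustration of the general kind of operation involved, here is a round-trip with AES-256-GCM using the `cryptography` package; the cipher choice, the prepended 12-byte nonce layout, and the key format are all assumptions for this sketch, not Apple's real scheme:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def decrypt_blob(key: bytes, blob: bytes) -> bytes:
    # Assumption: a 12-byte nonce is prepended to the ciphertext.
    nonce, ciphertext = blob[:12], blob[12:]
    return AESGCM(key).decrypt(nonce, ciphertext, None)

# Round-trip demo with a freshly generated key.
key = AESGCM.generate_key(bit_length=256)
nonce = os.urandom(12)
plaintext = b'{"reject": [], "remove": []}'
blob = nonce + AESGCM(key).encrypt(nonce, plaintext, None)
print(decrypt_blob(key, blob).decode())  # {"reject": [], "remove": []}
```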
## Understanding the overrides

The overrides are JSON files that contain safety filters for various generative models. Each override appears to be associated with a specific model context (from what I can tell) and contains rules that determine how the model should behave in certain situations, such as filtering out harmful content or ensuring compliance with safety standards.

Here is an example of one of the overrides' `metadata.json` files, sourced from `dec_out_repo/decrypted_overrides/com.apple.gm.safety_deny.output.code_intelligence.base`. Note the `output` part of the specifier, which indicates that this is a safety override for model output rather than user input:

```json
{
  "reject": [
    "xylophone copious opportunity defined elephant 10out",
    "xylophone copious opportunity defined elephant out"
  ],
  "remove": [],
  "replace": {},
  "regexReject": [
    "(?i)\\bbitch\\b",
    "(?i)\\bdago\\b",
    "(?i)\\bdyke\\b",
    "(?i)\\bhebe\\b",
    ...
  ],
  "regexRemove": [],
  "regexReplace": {}
}
```
Here, the `reject` field contains exact phrases whose presence results in a guardrail violation. The `remove` field contains phrases that will be removed from the output, while the `replace` field maps phrases to their replacements. The `regexReject`, `regexRemove`, and `regexReplace` fields contain regular expressions that are used to match and filter content in the same manner.