# apple_generative_model_safety_decrypted

Decrypted Generative Model safety files for Apple Intelligence, containing the safety filters extracted from the Apple Intelligence models.
## Structure

- `decrypted_overrides/`: Contains decrypted overrides for various models.
  - `com.apple.*/`: Directories named using the Asset Specifier associated with the safety info.
    - `Info.plist`: Contains metadata for the override.
    - `AssetData/`: Contains the decrypted JSON files.
- `get_key_lldb.py`: Script to get the encryption key (see usage info below).
- `decrypt_overrides.py`: Script to decrypt the overrides (see usage info below).
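The layout above can be traversed programmatically. Here is a small sketch that collects each override's `Info.plist` metadata; only the directory structure described above is assumed, and the plist keys are whatever Apple ships in each override:

```python
import plistlib
from pathlib import Path

def list_overrides(root: str) -> dict[str, dict]:
    """Map each asset-specifier directory name to its parsed Info.plist."""
    overrides = {}
    for info in sorted(Path(root).glob("*/Info.plist")):
        with open(info, "rb") as f:
            overrides[info.parent.name] = plistlib.load(f)
    return overrides

if __name__ == "__main__":
    # Print each specifier and the metadata keys it carries.
    for specifier, meta in list_overrides("decrypted_overrides").items():
        print(specifier, sorted(meta))
```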
## Usage

### Python dependencies

`cryptography` is the only dependency required to run the decryption script. You can install it using pip:

```
pip install cryptography
```
### Getting the encryption key

To retrieve the encryption key (generated by `ModelCatalog.Obfuscation.readObfuscatedContents`) for the overrides, you must attach LLDB to `GenerativeExperiencesSafetyInferenceProvider` (`/System/Library/ExtensionKit/Extensions/GenerativeExperiencesSafetyInferenceProvider.appex/Contents/MacOS/GenerativeExperiencesSafetyInferenceProvider`). It is important that this is Xcode's LLDB, not the default macOS one or LLVM's lldb. The recommended way to get LLDB to attach:

- Run:

  ```
  sudo killall GenerativeExperiencesSafetyInferenceProvider; sudo xcrun lldb -w -n GenerativeExperiencesSafetyInferenceProvider /System/Library/ExtensionKit/Extensions/GenerativeExperiencesSafetyInferenceProvider.appex/Contents/MacOS/GenerativeExperiencesSafetyInferenceProvider
  ```

- In the Shortcuts app, create a dummy shortcut that uses the generative model action ("Use Model") and select the On-Device option. Type anything into the text field; the contents don't matter. Then run the shortcut.
- You should see LLDB attach to the newly started instance of `GenerativeExperiencesSafetyInferenceProvider` with a message like this:
  ```
  (lldb) process attach --name "GenerativeExperiencesSafetyInferenceProvider" --waitfor
  Process 53629 stopped
  * thread #1, stop reason = signal SIGSTOP
      frame #0: 0x00000001839f41f8 dyld`dyld4::PrebuiltLoader::dependent(dyld4::RuntimeState const&, unsigned int, mach_o::LinkedDylibAttributes*) const + 116
  dyld`dyld4::PrebuiltLoader::dependent:
  ->  0x1839f41f8 <+116>: add    x0, sp, #0xe
      0x1839f41fc <+120>: mov    x1, x19
      0x1839f4200 <+124>: bl     0x1839e50dc    ; dyld4::Loader::LoaderRef::loader(dyld4::RuntimeState const&) const
      0x1839f4204 <+128>: ldrh   w8, [x20, #0x4]
  Target 0: (GenerativeExperiencesSafetyInferenceProvider) stopped.
  Executable binary set to "/System/Library/ExtensionKit/Extensions/GenerativeExperiencesSafetyInferenceProvider.appex/Contents/MacOS/GenerativeExperiencesSafetyInferenceProvider".
  Architecture set to: arm64e-apple-macosx-.
  ```
- In this repository's root, run the following command in LLDB:

  ```
  command script import get_key_lldb.py
  ```

- Then run `c` to continue the process. LLDB will print the encryption key to the console and save it to `./key.bin`.
### Decrypting the overrides

To decrypt the overrides, run the following command in the root of this repository:

```
python decrypt_overrides.py /System/Library/AssetsV2/com_apple_MobileAsset_UAF_FM_Overrides/purpose_auto \
  -k key.bin \
  -o decrypted_overrides
```

The `decrypted_overrides` directory will be created if it does not exist, and the decrypted overrides will be placed in it. This step is only necessary if the overrides have been updated; this repository already contains a decrypted version of the overrides that is up to date as of June 28, 2025.
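The actual decryption logic lives in `decrypt_overrides.py` and is not reproduced here. As an illustration of the general kind of operation involved, here is a round-trip with AES-256-GCM using the `cryptography` package; the cipher choice, the prepended 12-byte nonce layout, and the key format are all assumptions for this sketch, not Apple's real scheme:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def decrypt_blob(key: bytes, blob: bytes) -> bytes:
    # Assumption: a 12-byte nonce is prepended to the ciphertext.
    nonce, ciphertext = blob[:12], blob[12:]
    return AESGCM(key).decrypt(nonce, ciphertext, None)

# Round-trip demo with a freshly generated key.
key = AESGCM.generate_key(bit_length=256)
nonce = os.urandom(12)
plaintext = b'{"reject": [], "remove": []}'
blob = nonce + AESGCM(key).encrypt(nonce, plaintext, None)
print(decrypt_blob(key, blob).decode())  # {"reject": [], "remove": []}
```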
## Understanding the overrides

The overrides are JSON files that contain safety filters for various generative models. Each override appears to be associated with a specific model context (from what I can tell) and contains rules that determine how the model should behave in certain situations, such as filtering out harmful content or ensuring compliance with safety standards.

Here is an example of one of the overrides' `metadata.json` files, sourced from `dec_out_repo/decrypted_overrides/com.apple.gm.safety_deny.output.code_intelligence.base`. Note the `output` part of the specifier, which indicates that this is a safety override for model output rather than user input:

```json
{
  "reject": [
    "xylophone copious opportunity defined elephant 10out",
    "xylophone copious opportunity defined elephant out"
  ],
  "remove": [],
  "replace": {},
  "regexReject": [
    "(?i)\\bbitch\\b",
    "(?i)\\bdago\\b",
    "(?i)\\bdyke\\b",
    "(?i)\\bhebe\\b",
    ...
  ],
  "regexRemove": [],
  "regexReplace": {}
}
```
Here, the `reject` field contains exact phrases whose presence results in a guardrail violation. The `remove` field contains phrases that will be removed from the output, while the `replace` field maps phrases to their replacements. The `regexReject`, `regexRemove`, and `regexReplace` fields contain regular expressions that are used to match and filter content in the same manner.