Build universal macOS simulator binaries by default, so a single binary runs on both Intel and Apple Silicon Macs. The build system now:
- Defaults to building for both the x86_64 and arm64 architectures
- Uses lipo to create universal binaries
- Allows single-architecture builds via ARCHS=arm64 or ARCHS=x86_64
- Conditionally includes the appropriate assembly files for each architecture
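
As a rough illustration of the flow described above, the following dry-run sketch prints the commands such a build might run rather than executing them. The file names (main.c, asm/context_*.S), the build/ output layout, and the helper function names are hypothetical and not taken from this change; skipping lipo for a single-architecture build is likewise an assumption.

```shell
#!/bin/sh
# Dry-run sketch only: paths, file names, and function names are hypothetical.

# Default to both architectures unless the caller set ARCHS.
resolve_archs() {
    echo "${ARCHS:-x86_64 arm64}"
}

# Pick the assembly source for one architecture (hypothetical file names).
asm_source_for() {
    case "$1" in
        x86_64) echo "asm/context_x86_64.S" ;;
        arm64)  echo "asm/context_arm64.S" ;;
        *)      echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Print the commands a build would run, without executing them.
plan_build() {
    slices=""
    for arch in $(resolve_archs); do
        src=$(asm_source_for "$arch") || return 1
        echo "clang -arch $arch main.c $src -o build/$arch/simulator"
        slices="$slices build/$arch/simulator"
    done
    # Re-split the collected slice paths into positional parameters to count them.
    set -- $slices
    if [ "$#" -gt 1 ]; then
        # Several slices: merge them into one universal binary.
        echo "lipo -create $* -output build/simulator"
    else
        # One slice: assumed here to be copied as-is, with no lipo step.
        echo "cp $1 build/simulator"
    fi
}
```

Calling plan_build with ARCHS unset prints one clang line per architecture followed by a lipo -create line; with ARCHS=arm64 it prints a single clang line and a copy.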