Commit Graph

31 Commits

Author SHA1 Message Date
Wunk
e13735b624
video_core: Implement an arm64 shader-jit backend (#7002)
* externals: Add oaksim submodule

Used for emitting ARM64 assembly

* common: Implement aarch64 ABI

Utilize oaknut to implement a stack frame.

* tests: Allow shader-jit tests for x64 and a64

Run the shader-jit tests for both x86_64 and arm64 targets

* video_core: Initialize arm64 shader-jit backend

Passes all current unit tests!

* shader_jit_a64: protect/unprotect memory when jit-ing

Required on MacOS. Memory needs to be fully unprotected and then
re-protected when writing or there will be memory access errors on
MacOS.

* shader_jit_a64: Fix ARM64-Imm overflow

These conditionals were throwing exceptions since the immediate values
were overflowing the available space in the `EOR` instructions. Instead
they are generated from `MOV` and then `EOR`-ed after.

* shader_jit_a64: Fix Geometry shader conditional

* shader_jit_a64: Replace `ADRL` with `MOVP2R`

Fixes some immediate-generation exceptions.

* common/aarch64: Fix CallFarFunction

* shader_jit_a64: Optimize `SantitizedMul`

Co-authored-by: merryhime <merryhime@users.noreply.github.com>

* shader_jit_a64: Fix address register offset behavior

Based on https://github.com/citra-emu/citra/pull/6942
Passes unit tests.

* shader_jit_a64: Fix `RET` address offset

A64 stack is 16-byte aligned rather than 8. So a direct port of the x64
code won't work. Fixes weird branches into invalid memory for any
shaders with subroutines.

* shader_jit_a64: Increase max program size

Tuned for A64 program size.

* shader_jit_a64: Use `UBFX` for extracting loop-state

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit_a64: Optimize `SUB+CMP` to `SUBS`

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit_a64: Optimize `CMP+B` to `CBNZ`

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit_a64: Use `FMOV` for `ONE` vector

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit_a64: Remove x86-specific documentation

* shader_jit_a64: Use `UBFX` to extract exponent

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit_a64: Remove redundant MIN/MAX `SRC2`-NaN check

Special handling only needs to check SRC1 for NaN, not SRC2.
It would work as follows in the four possible cases:

No NaN: No special handling needed.
Only SRC1 is NaN: The special handling is triggered because SRC1 is NaN, and SRC2 is picked.
Only SRC2 is NaN: FMAX automatically picks SRC2 because it always picks the NaN if there is one.
Both SRC1 and SRC2 are NaN: The special handling is triggered because SRC1 is NaN, and SRC2 is picked.

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit/tests:: Add catch-stringifier for vec2f/vec3f

* shader_jit/tests: Add Dest Mask unit test

* shader_jit_a64: Fix Dest-Mask `BSL` operand order

Passes the dest-mask unit tests now.

* shader_jit_a64: Use `MOVI` for DestEnable mask

Accelerate certain cases of masking with MOVI as well

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit/tests: Add source-swizzle unit test

This is not expansive. Generating all `4^4` cases seems to make Catch2
crash. So I've added some component-masking(non-reordering) tests based
on the Dest-Mask unit-test and some additional ones to test
broadcasts/splats and component re-ordering.

* shader_jit_a64: Fix swizzle index generation

This was still generating `SHUFPS` indices and not the ones that we wanted for the `TBL` instruction. Passes all unit tests now.

* shader_jit/tests: Add `ShaderSetup` constructor to `ShaderTest`

Rather than using the direct output of `CompileShaderSetup` allow a
`ShaderSetup` object to be passed in directly.  This enabled the ability
emit assembly that is not directly supported by nihstro.

* shader_jit/tests: Add `CALL` unit-test

Tests nested `CALL` instructions to eventually reach an `EX2`
instruction.

EX2 is picked in particular since it is implemented as an even deeper
dispatch and ensures subroutines are properly implemented between `CALL`
instructions and implementation-calls.

* shader_jit_a64: Fix nested `BL` subroutines

`lr` was getting writen over by nested calls to `BL`, causing undefined
behavior with mixtures of `CALL`, `EX2`, and `LG2` instructions.

Each usage of `BL` is now protected with a stach push/pop to preserve
and restore teh `lr` register to allow nested subroutines to work
properly.

* shader_jit/tests: Allocate generated tests on heap

Each of these generated shader-test objects were causing the stack to
overflow.  Allocate each of the generated tests on the heap and use
unique_ptr so they only exist within the life-time of the `REQUIRE`
statement.

* shader_jit_a64: Preserve `lr` register from external function calls

`EMIT` makes an external function call, and should be preserving `lr`

* shader_jit/tests: Add `MAD` unit-test

The Inline Asm version requires an upstream fix:
https://github.com/neobrain/nihstro/issues/68

Instead, the program code is manually configured and added.

* shader_jit/tests: Fix uninitialized instructions

These `union`-type instruction-types were uninitialized, causing tests
to indeterminantly fail at times.

* shader_jit_a64: Remove unneeded `MOV`

Residue from the direct-port of x64 code.

* shader_jit_a64: Use `std::array` for `instr_table`

Add some type-safety and const-correctness around this type as well.

* shader_jit_a64: Avoid c-style offset casting

Add some more const-correctness to this function as well.

* video_core: Add arch preprocessor comments

* common/aarch64: Use X16 as the veneer register

https://developer.arm.com/documentation/102374/0101/Procedure-Call-Standard

* shader_jit/tests: Add uniform reading unit-test

Particularly to ensure that addresses are being properly truncated

* common/aarch64: Use `X0` as `ABI_RETURN`

`X8` is used as the indirect return result value in the case that the
result is bigger than 128-bits. Principally `X0` is the general-case
return register though.

* common/aarch64: Add veneer register note

`LR` is generally overwritten by `BLR` anyways, and would also be a safe
veneer to utilize for far-calls.

* shader_jit_a64: Remove unneeded scratch register from `SanitizedMul`

* shader_jit_a64: Fix CALLU condition

Should be `EQ` not `NE`. Fixes the regression on Kid Icarus.
No known regressions anymore!

---------

Co-authored-by: merryhime <merryhime@users.noreply.github.com>
Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>
2023-11-05 21:40:31 +01:00
Steveice10
27bad3a699
audio_core: Replace AAC decoders with single FAAD2-based decoder. (#7098) 2023-11-04 14:56:13 -07:00
SachinVin
51996c54f0
audio_core\hle\adts_reader.cpp: Use BitField to parse ADTS header (#6719) 2023-07-28 12:15:58 -07:00
Steveice10
9d4609e29a
build: Bundle libraries in-place as well on MSVC. (#6665) 2023-07-06 02:37:06 +02:00
SachinVin
5311c939a2 tests/audio_core: add sanity test cases for LLE vs HLE 2023-05-25 20:23:21 +05:30
SachinVin
41f13456c0
Chore: Enable warnings as errors on MSVC (#6456)
* tests: add Sanity test for SplitFilename83

fix test

fix test

* disable `C4715:not all control paths return a value` for nihstro includes

nihstro: no warn

* Chore: Enable warnings as errors on msvc + fix warnings

fixes

some more warnings

clang-format

* more fixes

* Externals: Add target_compile_options `/W0` nihstro-headers and ...

Revert "disable `C4715:not all control paths return a value` for nihstro includes"
This reverts commit 606d79b55d.

* src\citra\config.cpp: ReadSetting: simplify type casting

* settings.cpp: Get*Name: remove superflous logs
2023-05-01 22:38:58 +03:00
Steveice10
ea649263b7
build: Improvements to bundled libraries support. (#6435) 2023-04-28 13:02:53 -07:00
Steveice10
a8848cce43 build: Update to support multi-arch builds. 2023-01-07 01:09:32 -08:00
Tobias
ccb50e7f2c
Port yuzu-emu/yuzu#9300: "CMake: Use precompiled headers to improve compile times" (#6213)
Co-authored-by: Ameer J <52414509+ameerj@users.noreply.github.com>
2022-12-17 16:06:38 +01:00
GPUCode
cbd5d1c15c
Upgrade codebase to C++ 20 + fix warnings + update submodules (#6115) 2022-09-21 18:36:12 +02:00
liushuyu
06316be8a7 audio_core: hle: mf: minor fix 2019-02-09 11:56:51 +01:00
B3N30
80b4dd21d2 audio_core: dsp_hle: add Media Foundation decoder...
* appveyor: switch to Media Foundation API
* Travis CI MinGW build needs an update with the container image
2019-02-09 11:56:51 +01:00
Weiyi Wang
055b9513a3 common/bitfield: make it endianness-aware 2019-01-28 22:09:43 -05:00
Yuri Kunde Schlesner
23506e8eae
Tests: Remove glad test OS X work-around 2018-09-01 11:44:24 +08:00
Subv
07089cfb3c Tests: Added some tests for the VMManager class.
Covering basic operations like mapping, unmapping, reprotecting and changing memory state.
2018-01-23 08:21:12 -05:00
Lioncash
ab021d163e CMakeLists: Derive the source directory grouping from targets themselves
Removes the need to store to separate SRC and HEADER variables,
and then construct the target in most cases.
2017-12-11 21:11:52 -05:00
Merry
e23c3cd7f7
Merge pull request #3145 from MerryMage/lg2-ex2
shader_jit_x64_compiler: Remove ABI overhead of LG2 and EX2
2017-12-03 17:38:38 +00:00
MerryMage
235a251d3c tests: Add tests for x64 shader jit
Tests LG2 and EX2 instructions
2017-11-30 18:17:35 +00:00
B3n30
e9a95b2e7d
CoreTiming: Reworked CoreTiming (#3119)
* CoreTiming: New CoreTiming; Add Test for CoreTiming
2017-11-25 14:56:57 +01:00
Subv
a8d2f5787f Tests: Added Memory::IsValidVirtualAddress tests. 2017-09-26 17:31:50 -05:00
MerryMage
a08edd67eb tests: Add tests for vadd 2017-07-23 12:29:51 +01:00
MerryMage
567c3a2ee7 tests: Arm testing framework 2017-07-23 12:08:43 +01:00
Yuri Kunde Schlesner
60a882cd50 Kernel/IPC: Add tests for HLERequestContext buffer translation 2017-06-18 16:05:58 -07:00
Yuri Kunde Schlesner
cebdae6c92 CMake: Create an INTERFACE target for Catch 2017-05-27 22:46:59 -07:00
Yuri Kunde Schlesner
7b81903756 CMake: Correct inter-module dependencies and library visibility
Modules didn't correctly define their dependencies before, which relied
on the frontends implicitly including every module for linking to
succeed.

Also changed every target_link_libraries call to specify visibility of
dependencies to avoid leaking definitions to dependents when not
necessary.
2017-05-27 18:41:24 -07:00
wwylele
8a8c0f348b Common: add ParamPackage 2017-03-01 23:30:57 +02:00
Jan Beich
774d3112af tests: add missing libcore dependency after 75ee2f8c67
$ (cmake -DENABLE_SDL2:BOOL=false /path/to/citra; gmake)
[...]
[ 85%] Linking CXX executable tests
../common/libcommon.a(microprofile.cpp.o): In function `MicroProfileThreadStart(pthread**, void* (*)(void*))':
src/common/microprofile.cpp:(.text+0x41): undefined reference to `pthread_create'
c++: error: linker command failed with exit code 1 (use -v to see invocation)
2016-12-07 18:30:49 +00:00
wwylele
282195b450 tests: add a work-around for macOS linking error 2016-11-19 18:55:35 +02:00
wwylele
75ee2f8c67 FileSys: add PathParser 2016-11-19 17:17:19 +02:00
MerryMage
87de1ca968 Tests: Run tests on CI 2016-05-19 19:28:08 +01:00
MerryMage
a03f9b6fb6 tests: Infrastructure for unit tests 2016-05-19 08:38:03 +01:00