Commit Graph

16704 Commits

Author SHA1 Message Date
MerryMage
c09a9e5cc7 macro_jit_x64: Select better registers
All registers are now callee-save registers.

RBX and RBP selected for STATE and RESULT because these are most commonly accessed; this is to avoid the REX prefix.
RBP not used for STATE because there are some SIB restrictions, RBX emits smaller code.
2020-06-15 21:19:38 +01:00
MerryMage
79aa7b3ace macro_jit_x64: Remove REGISTERS
Unnecessary since this is just an offset from STATE.
2020-06-15 21:00:59 +01:00
MerryMage
35db6e1c68 macro_jit_x64: Remove JITState::parameters
This can be passed in as an argument instead.
2020-06-15 20:55:02 +01:00
MerryMage
389549b80d macro_jit_x64: Remove METHOD_ADDRESS_64
Unnecessary variable.
2020-06-15 20:51:33 +01:00
MerryMage
a6a43a5ae0 macro_jit_x64: Remove RESULT_64
This Reg64 codepath has the exact same behaviour as the Reg32 one.
2020-06-15 20:35:08 +01:00
MerryMage
7c6203dc5e xbyak_abi: Prefer returning a struct to using out parameters in ABI_CalculateFrameSize 2020-06-15 19:07:11 +01:00
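A minimal sketch of the struct-return style this commit title describes. It is not the actual xbyak_abi code: `FrameInfo`, `CalculateFrameSize`, and its parameters are hypothetical names, and the alignment rule is a simplified x86-64 assumption (return address on the stack at entry, `shadow_space` a multiple of 16).

```cpp
#include <cstddef>

// Hypothetical result type: both values travel together instead of
// being written through two out pointers.
struct FrameInfo {
    std::size_t subtraction;  // bytes to subtract from RSP after the pushes
    std::size_t xmm_offset;   // offset from RSP where vector registers are saved
};

// Hypothetical helper, assuming 8-byte GPR pushes and 16-byte XMM slots.
inline FrameInfo CalculateFrameSize(std::size_t gpr_count, std::size_t xmm_count,
                                    std::size_t shadow_space) {
    // The return address plus the pushed GPRs determine the misalignment.
    const std::size_t pushed = 8 + gpr_count * 8;
    const std::size_t padding = (pushed % 16 == 0) ? 0 : 8;
    const std::size_t xmm_bytes = xmm_count * 16;
    // XMM slots sit directly above the shadow space, which keeps them aligned.
    return FrameInfo{padding + xmm_bytes + shadow_space, shadow_space};
}
```

A caller can then read `frame.subtraction` and `frame.xmm_offset` from a single return value, with no uninitialized locals to pass by pointer.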
MerryMage
36362e9695 xbyak_abi: Register indexes should be unsigned 2020-06-15 19:07:11 +01:00
MerryMage
d563017dfe xbyak_abi: Remove *GPS variants of stack manipulation functions 2020-06-15 18:59:54 +01:00
MerryMage
4417770ba9 xbyak_abi: Fix ABI_PushRegistersAndAdjustStack
Pushing GPRs twice.
2020-06-15 18:59:01 +01:00
David
5c9dee2c94
Merge pull request #4085 from ReinUsesLisp/gcc-times
video_core/macro_jit_x64: Remove initializer in member variable
2020-06-15 23:05:21 +10:00
ReinUsesLisp
6e5d8aac4d video_core/macro_jit_x64: Remove initializer in member variable
Fix build time issues on gcc. Confirmed through asan that avoiding this
initialization is safe.
2020-06-15 05:17:55 -03:00
bunnei
55ebf68636
Merge pull request #4070 from ogniK5377/GetTPCMasks-fix
nvdrv: Fix GetTPCMasks for ioctl3
2020-06-14 20:12:45 -04:00
VolcaEM
39213b1c59
Clang-format again 2020-06-14 19:41:28 +02:00
VolcaEM
198b0fa790
Use consistent variable names 2020-06-14 19:37:44 +02:00
VolcaEM
1520d7865d
Clang-format 2020-06-14 19:34:58 +02:00
VolcaEM
761d206049
Make assert strings consistent 2020-06-14 19:30:08 +02:00
VolcaEM
151a3fe7b3
Attempt to fix crashes in SSBU and refactor IsValidNRO 2020-06-14 19:28:39 +02:00
bunnei
89d11f2268
Merge pull request #4069 from ogniK5377/total-phys-mem
kernel: Account for system resource size for memory usage
2020-06-14 00:44:34 -04:00
bunnei
92021a344c
Merge pull request #4064 from ReinUsesLisp/invalidate-buffers
gl_rasterizer: Mark vertex buffers as dirty after buffer cache invalidation
2020-06-14 00:29:16 -04:00
bunnei
c2ea1e1bcb
Merge pull request #4049 from ReinUsesLisp/separate-samplers
shader/texture: Join separate image and sampler pairs offline
2020-06-13 13:48:27 -04:00
David Marcec
42250427c5 audren: Implement RendererInfo
Fixes ZLA softlock
2020-06-13 14:04:28 +10:00
bunnei
5633887569
Merge pull request #3986 from ReinUsesLisp/shader-cache
shader_cache: Implement a generic runtime shader cache
2020-06-12 23:14:48 -04:00
bunnei
e1911e5c8b
Merge pull request #4010 from ogniK5377/reserve-always-break
kernel: ResourceLimit::Reserve remove useless while loop
2020-06-12 22:30:19 -04:00
ReinUsesLisp
87011a97f9 gl_arb_decompiler: Implement FSwizzleAdd 2020-06-11 22:12:07 -03:00
ReinUsesLisp
a63a0daa5e gl_arb_decompiler: Implement an assembly shader decompiler
Emit code compatible with NV_gpu_program5.
This should emit code compatible with Fermi, but it wasn't tested on
that architecture. Pascal has some issues not present on Turing GPUs.
2020-06-11 22:12:07 -03:00
ReinUsesLisp
d89888389d yuzu/configuration: Show assembly shaders check box 2020-06-10 19:04:53 -03:00
David Marcec
b15cbf9bcf nvdrv: Fix GetTPCMasks for ioctl3
Fixes animal crossing svcBreak on launch
2020-06-10 18:36:42 +10:00
David Marcec
74ff1db758 kernel: Account for system resource size for memory usage
GetTotalPhysicalMemoryAvailableWithoutSystemResource & GetTotalPhysicalMemoryUsedWithoutSystemResource seem to subtract the resource size from the usage.
2020-06-10 14:49:00 +10:00
bunnei
83e3b77ed7
Merge pull request #4027 from ReinUsesLisp/3d-slices
texture_cache: Implement rendering to 3D textures
2020-06-09 21:52:15 -04:00
ReinUsesLisp
6508cdd003 buffer_cache: Avoid passing references of shared pointers and misc style changes
Instead of using a shared pointer as the template argument, use the
underlying type and manage shared pointers explicitly. This can make
removing shared pointers from the cache easier.

While we are at it, make some misc style changes and general
improvements (like insert_or_assign instead of operator[] + operator=).
2020-06-09 18:30:49 -03:00
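As an illustration of the `insert_or_assign` change mentioned above, here is a small sketch. `Block` and `Store` are hypothetical names, not taken from buffer_cache. With `operator[]` the mapped type must be default-constructible and is default-constructed before being assigned; `insert_or_assign` (C++17) avoids both.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical mapped type with no default constructor, so
// map[key] = Block{size} would not even compile.
struct Block {
    explicit Block(int size_) : size{size_} {}
    int size;
};

// Inserts or overwrites the entry; returns true if the key was new.
inline bool Store(std::unordered_map<std::string, Block>& map,
                  const std::string& key, int size) {
    return map.insert_or_assign(key, Block{size}).second;
}
```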
ReinUsesLisp
7646f2c21d gl_rasterizer: Mark vertex buffers as dirty after buffer cache invalidation
Vertex buffer bindings become invalid after the stream buffer is
invalidated. We were originally doing this, but it got lost at some
point.

- Fixes Animal Crossing: New Horizons, but it affects everything.
2020-06-08 20:24:16 -03:00
ReinUsesLisp
6e122f0b2c buffer_cache: Return stream buffer invalidation in Map instead of Unmap
We have to invalidate whatever cache is being used before uploading the
data, hence it makes more sense to return this on Map instead of Unmap.
2020-06-08 20:22:31 -03:00
unknown
20a779299a Add game version to title bar 2020-06-08 23:58:04 +02:00
bunnei
3626254f48
Merge pull request #4040 from ReinUsesLisp/nv-transform-feedback
gl_rasterizer: Use NV_transform_feedback for XFB on assembly shaders
2020-06-08 16:18:33 -04:00
bunnei
98d2461529
Merge pull request #4052 from ReinUsesLisp/debug-output
renderer_opengl: Only enable DEBUG_OUTPUT when graphics debugging is enabled
2020-06-08 10:16:41 -04:00
ReinUsesLisp
bd43c05470 texture_cache: Port original code management for 2D vs 3D textures
Handle blits to images as 2D, even when they have block depth.

- Fixes rendering issues on Luigi's Mansion 3
2020-06-08 05:02:22 -03:00
ReinUsesLisp
c99f5d405b texture_cache: Simplify blit code 2020-06-08 05:01:44 -03:00
ReinUsesLisp
3c2ae53b4c texture_cache: Handle 3D texture blits with one layer 2020-06-08 05:01:00 -03:00
ReinUsesLisp
c95c254f3e texture_cache: Implement rendering to 3D textures
This allows rendering to 3D textures with more than one slice.
Applications are allowed to render to more than one slice of a texture
using gl_Layer from a VTG shader.

This also requires reworking how 3D texture collisions are handled. For
now, this commit allows rendering to slices but not to miplevels. When a
render target attempts to write to a mipmap, we fall back to the previous
implementation (copying or flushing as needed).

- Fixes color correction 3D textures on UE4 games (rainbow effects).
- Allows Xenoblade games to render to 3D textures directly.
2020-06-08 05:01:00 -03:00
Rodrigo Locatti
2293e8a11a
Merge pull request #4034 from ReinUsesLisp/storage-texels
vk_rasterizer: Implement storage texels and atomic image operations
2020-06-07 18:43:24 -03:00
ReinUsesLisp
abcea1bb18 rasterizer_cache: Remove files and includes
The rasterizer cache is no longer used. Each cache has its own generic
implementation optimized for the cached data.
2020-06-07 04:32:57 -03:00
ReinUsesLisp
678f95e4f8 vk_pipeline_cache: Use generic shader cache
Trivially port the generic shader cache to Vulkan.
2020-06-07 04:32:57 -03:00
ReinUsesLisp
b96f65b62b gl_shader_cache: Use generic shader cache
Trivially port the generic shader cache to OpenGL.
2020-06-07 04:32:57 -03:00
ReinUsesLisp
dc27252352 shader_cache: Implement a generic shader cache
Implement a generic shader cache for fast lookups and invalidations.
Invalidations are cheap unless a shader actually has to be invalidated,
in which case they are expensive.

Use two mutexes instead of one to avoid locking invalidations for
lookups and vice versa. When a shader has to be removed, lookups are
locked as expected.
2020-06-07 04:32:32 -03:00
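A rough sketch of the two-mutex idea described in this commit message, under stated assumptions. `TinyShaderCache`, its members, and the 4 KiB page granularity are hypothetical and not the actual shader_cache implementation. Lookups take only `lookup_mutex`, invalidations take only `invalidation_mutex`, and the lookup lock is acquired only when an entry really has to be removed.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <optional>
#include <unordered_map>
#include <vector>

class TinyShaderCache {
public:
    static constexpr std::uint64_t PAGE_BITS = 12;  // assumed page size of 4 KiB

    void Register(std::uint64_t addr, int shader) {
        // Both tables change, so take both locks (deadlock-free via std::lock).
        std::scoped_lock lock{lookup_mutex, invalidation_mutex};
        lookup_table[addr] = shader;
        pages[addr >> PAGE_BITS].push_back(addr);
    }

    std::optional<int> TryGet(std::uint64_t addr) const {
        std::scoped_lock lock{lookup_mutex};
        const auto it = lookup_table.find(addr);
        if (it == lookup_table.end()) {
            return std::nullopt;
        }
        return it->second;
    }

    // Returns how many shaders were removed from the given page.
    std::size_t InvalidatePage(std::uint64_t page) {
        std::scoped_lock lock{invalidation_mutex};
        const auto it = pages.find(page);
        if (it == pages.end()) {
            return 0;  // cheap path: lookups were never blocked
        }
        // Expensive path: a shader is removed, so lookups must wait.
        std::scoped_lock lookup_lock{lookup_mutex};
        const std::size_t removed = it->second.size();
        for (const std::uint64_t addr : it->second) {
            lookup_table.erase(addr);
        }
        pages.erase(it);
        return removed;
    }

private:
    mutable std::mutex lookup_mutex;
    std::mutex invalidation_mutex;
    std::unordered_map<std::uint64_t, int> lookup_table;
    std::unordered_map<std::uint64_t, std::vector<std::uint64_t>> pages;
};
```

The point of the split is that an invalidation of a page holding no shaders, presumably the common case, never contends with concurrent lookups.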
Morph
03fad5ebe8 yuzu/frontend: Remove internal resolution option 2020-06-06 15:56:14 -04:00
bunnei
03fd5aa384
Merge pull request #4055 from ReinUsesLisp/nvidia-443-24
gl_device: Black list NVIDIA 443.24 for fast buffer uploads
2020-06-06 02:37:24 -04:00
ReinUsesLisp
e78d681a6c gl_device: Black list NVIDIA 443.24 for fast buffer uploads
Skip fast buffer uploads on Nvidia 443.24 Vulkan beta driver on OpenGL.
This driver throws the following error when calling BufferSubData or
BufferData on buffers that are candidates for fast constant buffer
uploads. This is the equivalent of push constants on Vulkan, except that
they can access the full buffer. The error:

Unknown internal debug message. The NVIDIA OpenGL driver has encountered
an out of memory error. This application might
behave inconsistently and fail.

If this error persists on future drivers, we might have to look deeper
into this issue. For now, we can black list it and log it as a temporary
solution.
2020-06-06 02:56:42 -03:00
ReinUsesLisp
354fbe701e renderer_opengl: Only enable DEBUG_OUTPUT when graphics debugging is enabled
Avoids logging when it's not relevant. This can potentially reduce
driver's internal thread overhead.
2020-06-05 21:21:12 -03:00
bunnei
98671b4cfe
Merge pull request #4013 from ReinUsesLisp/skip-no-xfb
vk_rasterizer: Skip transform feedbacks when extension is unavailable
2020-06-05 11:14:36 -04:00
ReinUsesLisp
5b2b6d594c shader/texture: Join separate image and sampler pairs offline
Games using D3D idioms can join images and samplers when a shader
executes, instead of baking them into a combined sampler image. This is
also possible on Vulkan.

One approach to this solution would be to use separate samplers on
Vulkan and leave this unimplemented on OpenGL, but we can't do this
because there's no consistent way of determining which constant buffer
holds a sampler and which one an image. We could in theory find the
first bit and if it's in the TIC area, it's an image; but this falls
apart when an image or sampler handle uses an index of zero.

The approach used is to track a LOP.OR operation (this is done at an
IR level, not at an ISA level), track again the constant buffers used as
source and store this pair. Then, outside of shader execution, join
the sampler and image pair with a bitwise or operation.

This approach won't work on games that truly use separate samplers in a
meaningful way. For example, pooling textures in a 2D array and
determining at runtime what sampler to use.

This invalidates OpenGL's disk shader cache :)

- Used mostly by D3D ports to Switch
2020-06-05 00:24:51 -03:00
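The join step described in this commit message can be sketched as a single bitwise or. The helper names are hypothetical, and the bit layout (image index in the low 20 bits, sampler index in the high 12 bits) is an assumption made for the example, not something the message states.

```cpp
#include <cstdint>

// Assumed layout: low 20 bits select the image, high 12 bits the sampler.
constexpr std::uint32_t IMAGE_BITS = 20;
constexpr std::uint32_t IMAGE_MASK = (1u << IMAGE_BITS) - 1;

// Joins the two halves read from separate constant buffer entries,
// mirroring what the shader's LOP.OR would have produced at runtime.
constexpr std::uint32_t JoinHandles(std::uint32_t image_handle,
                                    std::uint32_t sampler_handle) {
    return image_handle | sampler_handle;
}

constexpr std::uint32_t ImageIndex(std::uint32_t handle) {
    return handle & IMAGE_MASK;
}

constexpr std::uint32_t SamplerIndex(std::uint32_t handle) {
    return handle >> IMAGE_BITS;
}
```

Because each half only populates its own bit range, the or is lossless as long as both entries are known ahead of time. The runtime-selected sampler case mentioned above is exactly where this assumption fails.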