Fernando Sahmkow
d267948a73
texture_cache: loosen TryReconstructSurface when accurate GPU is not on.
...
Also corrects some asserts.
2019-06-20 21:36:12 -03:00
Fernando Sahmkow
6bd034eae9
engine_upload: Adapt to new Texture Cache
2019-06-20 21:36:12 -03:00
ReinUsesLisp
345e73f2fe
video_core: Use un-shifted block sizes to avoid integer divisions
...
Instead of storing all block widths, heights and depths in their shifted
form:
block_width = 1U << block_shift;
store them as they are provided by the emulated hardware (their
block_shift form). This way we can avoid the costly Common::AlignUp
operation when aligning texture sizes and replace CPU integer
divisions with bitwise logic (defined in Common::AlignBits).
2019-06-20 21:36:12 -03:00
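The alignment trick described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the project's code: AlignUpSketch and AlignBitsSketch are hypothetical stand-ins for Common::AlignUp and Common::AlignBits, whose real signatures are not shown in this log.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for Common::AlignUp: rounds value up to a multiple
// of size. With a size only known at run time this needs a modulo.
constexpr std::uint32_t AlignUpSketch(std::uint32_t value, std::uint32_t size) {
    const std::uint32_t mod = value % size;
    return mod == 0 ? value : value - mod + size;
}

// Hypothetical stand-in for Common::AlignBits: the alignment is passed in its
// block_shift (log2) form, so only an add and two shifts are needed.
constexpr std::uint32_t AlignBitsSketch(std::uint32_t value, std::uint32_t align_log2) {
    return ((value + (1U << align_log2) - 1) >> align_log2) << align_log2;
}

// Both forms agree when the size is a power of two.
static_assert(AlignBitsSketch(100, 4) == AlignUpSketch(100, 1U << 4));
```

Keeping the shift around means the power-of-two size can still be recovered with 1U << block_shift whenever it is needed.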
bunnei
c7b5c245e1
Merge pull request #2562 from ReinUsesLisp/split-cbuf-upload
...
video_core/engines: Move ConstBufferInfo out of Maxwell3D
2019-06-17 22:35:04 -04:00
ReinUsesLisp
528c15051c
kepler_compute: Use std::array for cbuf info
2019-06-07 20:36:22 -03:00
ReinUsesLisp
17d5fb6d06
kepler_compute: Fix block_dim_x encoding
2019-06-07 20:35:46 -03:00
ReinUsesLisp
2f2a61887a
video_core/engines: Move ConstBufferInfo out of Maxwell3D
2019-06-07 19:47:15 -03:00
Fernando Sahmkow
a32c52b1d8
shader_bytecode: Mark EXIT as flow instruction
2019-06-04 12:18:35 -04:00
ReinUsesLisp
75e7b45d69
shader/memory: Implement ST (generic memory)
2019-05-20 22:41:53 -03:00
ReinUsesLisp
f78ef617b6
shader/memory: Implement LD (generic memory)
2019-05-20 22:38:59 -03:00
bunnei
d49efbfb4a
Merge pull request #2441 from ReinUsesLisp/al2p
...
shader: Implement AL2P and ALD.PHYS
2019-05-19 14:02:58 -04:00
Hexagon12
b54bd3f018
Merge pull request #2472 from FernandoS27/tic
...
maxwell_3d: reduce severity of different component formats assert.
2019-05-19 15:04:47 +01:00
Hexagon12
3bd5f01240
Merge pull request #2469 from lioncash/copyable
...
video_core/engines/maxwell_3d: Add is_trivially_copyable_v check for Regs
2019-05-19 15:02:17 +01:00
Sebastian Valle
a6ed792ac4
Merge pull request #2470 from lioncash/ranged-for
...
video_core/engines/maxwell_3d: Simplify for loops into ranged for loops within InitializeRegisterDefaults()
2019-05-19 09:01:19 -05:00
Fernando Sahmkow
fc975e9021
maxwell_3d: reduce severity of different component formats assert.
...
The severity was reduced because the assert fired in most games, and at
such a constant rate that it heavily affected performance for the end
user. In general, we are well aware of the assert and an implementation
is already planned.
2019-05-14 17:12:54 -04:00
Lioncash
b01cce716e
video_core/engines/engine_upload: Amend constructor initializer list order
...
Silences a -Wreorder warning.
2019-05-14 13:43:28 -04:00
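For context on the -Wreorder warning mentioned above: C++ initializes members in declaration order, whatever order the initializer list uses. A minimal sketch with a hypothetical RangeSketch type (not the actual engine_upload State class):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical type used only for illustration.
struct RangeSketch {
    // The initializer list follows the declaration order below, so reading
    // `begin` while initializing `end` is well defined. Listing `end` first
    // would not change the real order, and compilers report it via -Wreorder.
    RangeSketch(std::uint32_t begin_, std::uint32_t size_)
        : begin{begin_}, end{begin + size_} {}

    std::uint32_t begin; // initialized first
    std::uint32_t end;   // initialized second
};
```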
Lioncash
9b6d993e52
video_core/engines/engine_upload: Default destructor in the cpp file
...
Avoids inlining destruction logic where applicable, and also makes
forward declarations not cause unexpected compilation errors depending
on where the State class is used.
2019-05-14 13:41:41 -04:00
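The forward-declaration issue described above can be sketched in a single file. The names StateSketch and Payload are hypothetical, and the use of std::unique_ptr is an assumption made for illustration; the log does not show how the real State class holds its members.

```cpp
#include <cassert>
#include <memory>

// --- roughly what a header might contain ---
struct Payload; // forward declaration only; Payload is incomplete here

class StateSketch {
public:
    StateSketch();
    ~StateSketch(); // declared here, defined where Payload is complete
    int Value() const;

private:
    std::unique_ptr<Payload> payload;
};

// --- roughly what the .cpp file might contain ---
struct Payload {
    int value = 42;
};

StateSketch::StateSketch() : payload{std::make_unique<Payload>()} {}
StateSketch::~StateSketch() = default; // Payload is complete at this point
int StateSketch::Value() const {
    return payload->value;
}
```

Had the destructor been left implicit, it would be generated inline wherever StateSketch is destroyed, and any such place that only sees the forward declaration would fail to compile.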
Lioncash
ec1c69258a
video_core/engines/engine_upload: Remove unnecessary const on parameters in function declarations
...
These only apply in the definition of the function. They can be omitted
from the declaration.
2019-05-14 13:40:09 -04:00
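A small sketch of the point made above, using a hypothetical function name: top-level const on a by-value parameter is not part of the function's type, so the declaration and the definition below refer to the same function.

```cpp
#include <cassert>

// Declaration: top-level const on a by-value parameter is omitted.
int AddOneSketch(int value);

// Definition: const only prevents the body from modifying its local copy.
int AddOneSketch(const int value) {
    return value + 1;
}
```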
Lioncash
0f83c8dffa
video_core/engines/engine_upload: Remove unnecessary includes
2019-05-14 13:39:04 -04:00
Lioncash
5db1b54b58
video_core/engines/maxwell3d: Get rid of three magic values in CallMethod()
...
We can use the named constant instead of using 32 directly.
2019-05-14 09:02:47 -04:00
Lioncash
48ce5880a0
video_core/engines/maxwell_3d: Simplify for loops into ranged for loops within InitializeRegisterDefaults()
...
Lessens the amount of code that needs to be read, and gets rid of the
need to introduce an indexing variable. Instead, we just operate on the
objects directly.
2019-05-14 08:53:19 -04:00
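The kind of simplification described above might look like the following sketch. ViewportSketch and its field are hypothetical and do not reproduce the actual register layout touched by InitializeRegisterDefaults().

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical register entry used only for illustration.
struct ViewportSketch {
    float depth_range_far = 0.0f;
};

// Before: an index variable that exists only to reach each element.
inline void InitIndexed(std::array<ViewportSketch, 16>& viewports) {
    for (std::size_t i = 0; i < viewports.size(); ++i) {
        viewports[i].depth_range_far = 1.0f;
    }
}

// After: operate on the objects directly.
inline void InitRanged(std::array<ViewportSketch, 16>& viewports) {
    for (auto& viewport : viewports) {
        viewport.depth_range_far = 1.0f;
    }
}
```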
Lioncash
c212fc9b2c
video_core/engines/maxwell_3d: Add is_trivially_copyable_v check for Regs
...
std::memset is used to clear the entire register structure, which
requires that the Regs struct be trivially copyable (otherwise undefined
behavior is invoked). This prevents the case where a non-trivial type is
potentially added to the struct.
2019-05-14 08:47:56 -04:00
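A minimal sketch of the guard described above, assuming a simplified RegsSketch struct in place of the real Regs layout:

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical, much smaller stand-in for the real register structure.
struct RegsSketch {
    std::array<std::uint32_t, 8> reg_array;
};

// Compilation fails if a non-trivially-copyable member (for example a
// std::string) is ever added, instead of std::memset silently invoking
// undefined behavior at run time.
static_assert(std::is_trivially_copyable_v<RegsSketch>,
              "RegsSketch must be trivially copyable to be cleared with std::memset");

inline void ClearRegs(RegsSketch& regs) {
    std::memset(&regs, 0, sizeof(regs));
}
```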
bunnei
c27b81cb85
Merge pull request #2429 from FernandoS27/compute
...
Corrections and Implementation on GPU Engines
2019-05-09 13:19:22 -04:00
ReinUsesLisp
d4df803b2b
shader_ir/other: Implement IPA.IDX
2019-05-02 21:46:37 -03:00
ReinUsesLisp
71aa9d0877
shader_ir/memory: Implement physical input attributes
2019-05-02 21:46:25 -03:00
ReinUsesLisp
bd81a03d9d
gl_shader_decompiler: Declare all possible varyings on physical attribute usage
2019-05-02 21:46:25 -03:00
ReinUsesLisp
7632a7d6d2
shader_bytecode: Add AL2P decoding
2019-05-02 21:46:25 -03:00
Fernando Sahmkow
e64c41efe8
Refactors and name corrections.
2019-05-01 15:31:39 -04:00
bunnei
c52233ec8b
Merge pull request #2322 from ReinUsesLisp/wswitch
...
video_core: Silence -Wswitch warnings
2019-04-28 22:24:58 -04:00
Fernando Sahmkow
b3118ee316
Fixes and Corrections to DMA Engine
2019-04-23 15:28:18 -04:00
Fernando Sahmkow
f1e5314f1a
Add Swizzle Parameters to the DMA engine
2019-04-23 11:21:00 -04:00
Fernando Sahmkow
e140e2ebc6
Add Documentation Headers to all the GPU Engines
2019-04-23 08:44:52 -04:00
Fernando Sahmkow
021d28c9b8
Corrections and styling
2019-04-23 08:02:24 -04:00
Fernando Sahmkow
701ce1c9d0
Implement Maxwell3D Data Upload
2019-04-22 19:27:36 -04:00
Fernando Sahmkow
e4ff140b99
Introduce skeleton of the GPU Compute Engine.
2019-04-22 19:05:43 -04:00
Fernando Sahmkow
a91d3fc639
Revamp Kepler Memory to use a subengine to manage uploads
2019-04-22 18:50:56 -04:00
bunnei
68b707711a
Merge pull request #2411 from FernandoS27/unsafe-gpu
...
GPU Manager: Implement ReadBlockUnsafe and WriteBlockUnsafe
2019-04-22 17:09:00 -04:00
bunnei
01100f8afd
Merge pull request #2400 from FernandoS27/corret-kepler-mem
...
Implement Kepler Memory on both Linear and BlockLinear.
2019-04-22 16:47:05 -04:00
bunnei
da0c3bc658
Merge pull request #2407 from FernandoS27/f2f
...
Do some corrections in conversion shader instructions.
2019-04-20 00:42:34 -04:00
ReinUsesLisp
fbe8d1ceaa
video_core: Silence -Wswitch warnings
2019-04-18 15:54:39 -03:00
bunnei
5bd5140bde
Merge pull request #2348 from FernandoS27/guest-bindless
...
Implement Bindless Textures on Shader Decompiler and GL backend
2019-04-17 20:59:49 -04:00
bunnei
0cfbd3325b
Merge pull request #2315 from ReinUsesLisp/severity-decompiler
...
shader_ir/decode: Reduce the severity of common assertions
2019-04-16 22:21:19 -04:00
Fernando Sahmkow
ef381e6924
Use ReadBlockUnsafe on TIC and TSC reading
...
Use ReadBlockUnsafe on TIC and TSC reading as memory is never flushed
from host GPU there.
2019-04-15 23:10:24 -04:00
Fernando Sahmkow
3e96c367bd
Use WriteBlock and ReadBlock.
2019-04-15 22:42:34 -04:00
Fernando Sahmkow
bec28d692d
Implement Block Linear copies in Kepler Memory.
2019-04-15 21:22:16 -04:00
Fernando Sahmkow
aa471274d9
Do some corrections in conversion shader instructions.
...
Corrects encodings for I2F, F2F, I2I and F2I.
Implements immediate variants of all four conversion types.
Adds assertions for unimplemented features.
2019-04-15 19:16:27 -04:00
Fernando Sahmkow
8a099ac99f
Correct Kepler Memory on Linear Pushes.
2019-04-15 14:51:36 -04:00
ReinUsesLisp
5c280e6ff0
shader_ir: Implement STG, keep track of global memory usage and flush
2019-04-14 00:25:32 -03:00
bunnei
353a099481
Merge pull request #2366 from FernandoS27/xmad-fix
...
Correct XMAD mode, psl and high_b on different encodings.
2019-04-09 19:15:01 -04:00
Fernando Sahmkow
5c55ae4e18
Correct LOP_IMN encoding
2019-04-08 13:39:12 -04:00
Fernando Sahmkow
16adc735a5
Correct XMAD mode, psl and high_b on different encodings.
2019-04-08 13:01:17 -04:00
Fernando Sahmkow
492040bd9c
Move ConstBufferAccessor to Maxwell3d, correct mistakes and clang format.
2019-04-08 11:36:11 -04:00
Fernando Sahmkow
4841440382
Implement TXQ_B
2019-04-08 11:29:52 -04:00
Fernando Sahmkow
ac3ba9a33e
Corrections to TEX_B
2019-04-08 11:28:44 -04:00
Fernando Sahmkow
7af82ca022
Implement Bindless Handling on SetupTexture
2019-04-08 11:23:46 -04:00
Fernando Sahmkow
e28fd3d0a5
Implement Bindless Samplers and TEX_B in the IR.
2019-04-08 11:23:42 -04:00
ReinUsesLisp
ddcb711ee8
maxwell_3d: Reduce severity of ProcessSyncPoint
2019-04-06 02:18:20 -03:00
bunnei
864280fabc
Merge pull request #2317 from FernandoS27/sync
...
Implement SyncPoint Register in the GPU.
2019-04-05 23:50:54 -04:00
Fernando Sahmkow
fc91e21206
Implement SyncPoint Register in the GPU.
2019-04-05 19:19:30 -04:00
Lioncash
22f02076c6
video_core/engines: Make memory manager members private
...
These aren't used externally by anything, so they can be made private
data members.
2019-04-05 18:26:43 -04:00
Lioncash
26223f8124
video_core/engines: Remove unnecessary inclusions where applicable
...
Replaces header inclusions with forward declarations where applicable
and also removes unused headers within the cpp file. This reduces a few
more dependencies on core/memory.h
2019-04-05 18:26:32 -04:00
ReinUsesLisp
04979560fb
shader_ir/memory: Reduce severity of LD_L cache management and log it
2019-04-03 17:12:44 -03:00
ReinUsesLisp
24abeb9a67
shader_ir/memory: Reduce severity of ST_L cache management and log it
2019-04-03 17:12:44 -03:00
bunnei
19330f45d3
maxwell_dma: Check for valid source and destination before copy.
...
- Avoid a crash in Octopath Traveler.
2019-03-20 22:36:03 -04:00
bunnei
22d3dfbcd4
gpu: Rewrite virtual memory manager using PageTable.
2019-03-20 22:36:02 -04:00
bunnei
574e89d924
video_core: Refactor to use MemoryManager interface for all memory access.
...
# Conflicts:
# src/video_core/engines/kepler_memory.cpp
# src/video_core/engines/maxwell_3d.cpp
# src/video_core/morton.cpp
# src/video_core/morton.h
# src/video_core/renderer_opengl/gl_global_cache.cpp
# src/video_core/renderer_opengl/gl_global_cache.h
# src/video_core/renderer_opengl/gl_rasterizer_cache.cpp
2019-03-16 00:38:48 -04:00
bunnei
2eaf6c41a4
gpu: Use host address for caching instead of guest address.
2019-03-14 22:34:42 -04:00
bunnei
633ce92908
Merge pull request #2147 from ReinUsesLisp/texture-clean
...
shader_ir: Remove "extras" from the MetaTexture
2019-03-10 17:28:36 -04:00
bunnei
7b574f406b
gpu: Move command processing to another thread.
2019-03-06 21:48:57 -05:00
Lioncash
f9ee0dc7ee
video_core/engines: Remove unnecessary includes
...
Removes a few unnecessary dependencies on core-related machinery, such
as core.h and memory.h, which reduces the amount of rebuilding
necessary if those files change.
This also uncovered some indirect dependencies within other source
files, which are fixed here as well.
2019-03-05 20:35:32 -05:00
bunnei
f15e2dd881
Merge pull request #2163 from ReinUsesLisp/bitset-dirty
...
maxwell_3d: Use std::bitset to manage dirty flags
2019-02-27 20:50:08 -05:00
Lioncash
b9238edd0d
common/math_util: Move contents into the Common namespace
...
These types are within the common library, so they should be within the
Common namespace.
2019-02-27 03:38:39 -05:00
ReinUsesLisp
5219edd715
maxwell_3d: Use std::bitset to manage dirty flags
2019-02-26 03:01:48 -03:00
ReinUsesLisp
5ca63d0675
shader/decode: Remove extras from MetaTexture
2019-02-26 00:11:30 -03:00
ReinUsesLisp
48e6f77c03
shader/decode: Split memory and texture instructions decoding
2019-02-26 00:11:30 -03:00
bunnei
c07987dfab
Merge pull request #2118 from FernandoS27/ipa-improve
...
shader_decompiler: Improve Accuracy of Attribute Interpolation.
2019-02-24 23:04:22 -05:00
Lioncash
a8fa5019b5
video_core: Remove usages of System::GetInstance() within the engines
...
Avoids the use of the global accessor in favor of explicitly making the
system a dependency within the interface.
2019-02-15 22:06:23 -05:00
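The idea of replacing a global accessor with an explicit dependency can be sketched as below. SystemSketch and EngineSketch are hypothetical types; the real Core::System and engine constructors are not shown in this log.

```cpp
#include <cassert>

// Hypothetical stand-in for the system object.
struct SystemSketch {
    int ticks = 0;
};

// The engine receives the system through its constructor instead of reaching
// for a global GetInstance(), so the dependency is visible in the interface.
class EngineSketch {
public:
    explicit EngineSketch(SystemSketch& system_) : system{system_} {}

    void Tick() {
        ++system.ticks;
    }

private:
    SystemSketch& system;
};
```

One practical consequence is that a test can construct its own SystemSketch and hand it to the engine without touching any global state.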
Lioncash
bd983414f6
core_timing: Convert core timing into a class
...
Gets rid of the largest set of mutable global state within the core.
This also paves the way for eliminating usages of GetInstance() on the
System class as a follow-up.
Note that no behavioral changes have been made, and this simply extracts
the functionality into a class. This also has the benefit of making
dependencies on the core timing functionality explicit within the
relevant interfaces.
2019-02-15 21:50:25 -05:00
Fernando Sahmkow
10682ad7e0
shader_decompiler: Improve Accuracy of Attribute Interpolation.
2019-02-14 03:25:07 -04:00
bunnei
8135f4bfce
Merge pull request #2110 from lioncash/namespace
...
core_timing: Rename CoreTiming namespace to Core::Timing
2019-02-12 19:26:37 -05:00
bunnei
c440ecfafe
Merge pull request #2104 from ReinUsesLisp/compute-assert
...
kepler_compute: Fixup assert and rename the engine
2019-02-12 19:24:34 -05:00
Lioncash
48d9d66dc5
core_timing: Rename CoreTiming namespace to Core::Timing
...
Places all of the timing-related functionality under the existing Core
namespace to keep things consistent, rather than having the timing
utilities sitting in its own completely separate namespace.
2019-02-12 12:42:17 -05:00
Fernando Sahmkow
f5ec165e8c
Corrected F2I None mode to RoundEven.
2019-02-11 18:46:45 -04:00
ReinUsesLisp
1ddcd0e6f0
kepler_compute: Fixup assert and rename engines
...
When I originally added the compute assert I used the wrong
documentation. This addresses that.
The dispatch register was tested with homebrew against hardware and is
triggered by some games (e.g. Super Mario Odyssey). What exactly is
missing to get a valid program bound by this engine requires more
investigation.
2019-02-10 19:29:33 -03:00
bunnei
dd1aab5446
gl_rasterizer: Implement a more accurate fermi 2D copy.
...
- This is a blit, use the blit registers.
2019-02-06 21:54:21 -05:00
bunnei
10ab714fe0
Merge pull request #2042 from ReinUsesLisp/nouveau-tex
...
maxwell_3d: Allow texture handles with TIC id zero
2019-02-06 20:19:20 -05:00
bunnei
72c70d6808
Merge pull request #2081 from ReinUsesLisp/lmem-64
...
shader_ir/memory: Add LD_L 64 bits loads
2019-02-05 09:17:48 -05:00
bunnei
bb4549a73d
Merge pull request #2082 from FernandoS27/txq-stl
...
Fix TXQ not using the component mask.
2019-02-04 20:22:32 -05:00
Mat M
a568cd805b
Update src/video_core/engines/shader_bytecode.h
...
Co-Authored-By: FernandoS27 <fsahmkow27@gmail.com>
2019-02-03 21:27:26 -04:00
Fernando Sahmkow
0306c50339
Fix TXQ not using the component mask.
2019-02-03 18:17:18 -04:00
ReinUsesLisp
2bdbb90af7
video_core: Assert on invalid GPU to CPU address queries
2019-02-03 04:58:40 -03:00
ReinUsesLisp
04e68e9738
maxwell_3d: Allow sampler handles with TSC id zero
2019-02-03 04:58:40 -03:00
ReinUsesLisp
390721a561
maxwell_3d: Allow texture handles with TIC id zero
...
Also remove "enabled" field from Tegra::Texture::FullTextureInfo because
it would become unused.
2019-02-03 04:58:24 -03:00
ReinUsesLisp
9feb68085d
shader_bytecode: Rename BytesN enums to BitsN
2019-02-03 00:25:40 -03:00
ReinUsesLisp
477d616f7d
shader_ir: Unify constant buffer offset values
...
Constant buffer values in the shader IR were using different offsets
depending on whether the access was direct or indirect. cbuf34 has a
non-multiplied offset while cbuf36 has a multiplied one. On shader
decoding, this commit multiplies the offset by four on cbuf34 queries.
2019-01-30 02:45:50 -03:00
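The unification described above amounts to a single scaling step, sketched here under the assumption that the cbuf36 form already carries the multiplied offset. NormalizeCbuf34OffsetSketch is a hypothetical helper, not a function from the shader decoder.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: converts a raw cbuf34-style offset into the multiplied
// form, so both encodings can be handled identically afterwards.
constexpr std::uint32_t NormalizeCbuf34OffsetSketch(std::uint32_t raw_offset) {
    return raw_offset * 4;
}

static_assert(NormalizeCbuf34OffsetSketch(3) == 12);
```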
ReinUsesLisp
3b84e04af1
shader_decode: Implement LDG and basic cbuf tracking
2019-01-30 00:00:15 -03:00
bunnei
1f4ca1e841
Merge pull request #1927 from ReinUsesLisp/shader-ir
...
video_core: Replace gl_shader_decompiler with an IR based decompiler
2019-01-25 23:42:14 -05:00
ReinUsesLisp
9a82dec74a
maxwell_3d: Set rt_separate_frag_data to 1 by default
...
Commercial games assume that this value is 1 but they never set it. On
the other hand nouveau manually sets this register. On
ConfigureFramebuffers we were asserting for what we are actually
implementing (according to envytools).
2019-01-22 04:14:29 -03:00
ReinUsesLisp
a1b845b651
shader_decode: Implement VMAD and VSETP
2019-01-15 17:54:53 -03:00
ReinUsesLisp
dd91650aaf
shader_decode: Implement HFMA2
2019-01-15 17:54:52 -03:00