Commit Graph

808 Commits

Author SHA1 Message Date
bunnei
9864da7d43
Merge pull request #4524 from lioncash/memory-log
shader/memory: Amend UNIMPLEMENTED_IF_MSG without a message
2020-08-27 00:16:10 -04:00
Lioncash
bafef3d1c9 async_shaders: Mark getters as const member functions
While we're at it, we can also mark them as nodiscard.
2020-08-24 01:15:50 -04:00
David
cbaf1bc711
Merge pull request #4443 from ameerj/vk-async-shaders
vulkan_renderer: Async shader/graphics pipeline compilation
2020-08-17 15:06:11 +10:00
ameerj
fde8102a41 Remove unneeded newlines, optional Registry in shader params
Addressing feedback from Rodrigo
2020-08-16 16:33:21 -04:00
Ameer J
f49ffdd648 Morph: Update worker allocation comment
Co-authored-by: Morph <39850852+Morph1984@users.noreply.github.com>
2020-08-16 12:02:22 -04:00
ameerj
1b829fbd7a move thread 1/4 count computation into allocate workers method 2020-08-16 12:02:22 -04:00
ameerj
31a76410e8 Address feedback, add shader compile notifier, update setting text 2020-08-16 12:02:22 -04:00
ameerj
c02464f64e Vk Async Worker directly emplace in cache 2020-08-16 12:02:22 -04:00
ameerj
4539073ce1 Address feedback. Bruteforce delete duplicates 2020-08-16 12:02:22 -04:00
ameerj
6ac97405df Vk Async pipeline compilation 2020-08-16 12:02:22 -04:00
Lioncash
dcc5562cd5 shader/memory: Amend UNIMPLEMENTED_IF_MSG without a message
We need to provide a message for this variant of the macro, so we can
simply log out the type being used.
2020-08-14 08:38:37 -04:00
Lioncash
6b13d08822 async_shaders: Resolve -Wpessimizing-move warning
Prevents pessimization of the move constructor (which thankfully didn't
actually happen in practice here, given std::thread isn't copyable).
2020-08-14 08:16:50 -04:00
Lioncash
b724a4d90c General: Tidy up clang-format warnings part 2 2020-08-13 14:19:08 -04:00
bunnei
f650cf8a9a
Merge pull request #4391 from lioncash/nrvo
video_core: Allow copy elision to take place where applicable
2020-07-24 06:33:09 -07:00
Rodrigo Locatti
9ea9a60e17
Merge pull request #4361 from ReinUsesLisp/lane-id
decode/other: Implement S2R.LaneId
2020-07-21 04:50:45 -03:00
Lioncash
6adc824d9d video_core: Allow copy elision to take place where applicable
Removes const from some variables that are returned from functions, as
this allows the move assignment/constructors to execute for them.
2020-07-21 00:36:13 -04:00
bunnei
3d13d7f48f
Merge pull request #4324 from ReinUsesLisp/formats
video_core: Fix, add and rename pixel formats
2020-07-21 00:13:04 -04:00
David Marcec
967307d3be Fix style issues 2020-07-18 14:24:32 +10:00
David Marcec
85b591f6f0 Remove duplicate config 2020-07-17 14:26:18 +10:00
David Marcec
f48187449e Use conditional var 2020-07-17 14:26:17 +10:00
David Marcec
468bd9c1b0 async shaders 2020-07-17 14:24:57 +10:00
ReinUsesLisp
210cc0204d decode/other: Implement S2R.LaneId
This maps to host's thread id.

- Fixes graphical issues on Paper Mario.
2020-07-16 16:09:39 -03:00
ReinUsesLisp
fbc232426d video_core: Rearrange pixel format names
Normalizes pixel format names to match Vulkan names. Previous to this
commit pixel formats had no convention, leading to confusion and
potential bugs.
2020-07-13 01:44:23 -03:00
bunnei
efef7b1517
Merge pull request #4147 from ReinUsesLisp/hset2-imm
shader/half_set: Implement HSET2_IMM
2020-06-26 23:14:56 -04:00
bunnei
2f2df9a4a7
Merge pull request #4083 from Morph1984/B10G11R11F
decode/image: Implement B10G11R11F
2020-06-24 11:02:38 -04:00
ReinUsesLisp
39ab33ee1c shader/half_set: Implement HSET2_IMM
Add HSET2_IMM. Due to the complexity of the encoding avoid using
BitField unions and read the relevant bits from the code itself.
This is less error prone.
2020-06-22 20:51:18 -03:00
Morph
480e1fa987 decode/image: Implement B10G11R11F
- Used by Kirby Star Allies
2020-06-20 00:28:30 -04:00
MerryMage
442e48ef4c memory_util: boost hashes are size_t
* boost::hash_value returns a size_t
* boost::hash_combine takes a size_t& argument
2020-06-18 15:47:43 +01:00
ReinUsesLisp
5b2b6d594c shader/texture: Join separate image and sampler pairs offline
Games using D3D idioms can join images and samplers when a shader
executes, instead of baking them into a combined sampler image. This is
also possible on Vulkan.

One approach to this solution would be to use separate samplers on
Vulkan and leave this unimplemented on OpenGL, but we can't do this
because there's no consistent way of determining which constant buffer
holds a sampler and which one an image. We could in theory find the
first bit and if it's in the TIC area, it's an image; but this falls
apart when an image or sampler handle use an index of zero.

The used approach is to track for a LOP.OR operation (this is done at an
IR level, not at an ISA level), track again the constant buffers used as
source and store this pair. Then, outside of shader execution, join
the sample and image pair with a bitwise or operation.

This approach won't work on games that truly use separate samplers in a
meaningful way. For example, pooling textures in a 2D array and
determining at runtime what sampler to use.

This invalidates OpenGL's disk shader cache :)

- Used mostly by D3D ports to Switch
2020-06-05 00:24:51 -03:00
ReinUsesLisp
e1438f8e91 shader/track: Move bindless tracking to a separate function 2020-06-04 23:02:55 -03:00
LC
9a0c1456e3
Merge pull request #4016 from ReinUsesLisp/invocation-info
shader/other: Fix hardcoded value in S2R INVOCATION_INFO
2020-06-02 09:47:53 -04:00
ReinUsesLisp
f2d1aa97ad shader/other: Fix hardcoded value in S2R INVOCATION_INFO
Geometry shaders built from Nvidia's compiler check for bits[16:23] to
be less than or equal to 0 with VSETP to default to a "safe" value of
0x8000'0000 (safe from hardware's perspective). To avoid hitting this
path in the shader, return 0x00ff'0000 from S2R INVOCATION_INFO.

This seems to be the maximum number of vertices a geometry shader can
emit in a primitive.
2020-05-30 01:49:14 -03:00
ReinUsesLisp
32e6727dae shader/other: Implement MEMBAR.CTS
This silences an assertion we were hitting and uses workgroup memory
barriers when the game requests it.
2020-05-27 00:19:45 -03:00
bunnei
508242c267
Merge pull request #3981 from ReinUsesLisp/bar
shader/other: Implement BAR.SYNC 0x0
2020-05-26 14:40:13 -04:00
bunnei
623d9c47a2
Merge pull request #3980 from ReinUsesLisp/red-op
shader/memory: Implement non-addition operations in RED
2020-05-26 12:50:41 -04:00
ReinUsesLisp
5d0986a53b shader/other: Implement BAR.SYNC 0x0
Trivially implement this particular case of BAR. Unless games use OpenCL
or CUDA barriers, we shouldn't hit any other case here.
2020-05-21 23:20:43 -03:00
ReinUsesLisp
103809a0ca shader/memory: Implement non-addition operations in RED
Trivially implement these instructions. They are used in Astral Chain.
2020-05-21 23:19:46 -03:00
ReinUsesLisp
e2b67a868b shader/other: Implement thread comparisons (NV_shader_thread_group)
Hardware S2R special registers match gl_Thread*MaskNV. We can trivially
implement these using Nvidia's extension on OpenGL or naively stubbing
them with the ARB instructions to match. This might cause issues if the
host device warp size doesn't match Nvidia's. That said, this is
unlikely on proper shaders.

Refer to the attached url for more documentation about these flags.
https://www.khronos.org/registry/OpenGL/extensions/NV/NV_shader_thread_group.txt
2020-05-21 23:18:37 -03:00
ReinUsesLisp
4e57f9d5cf shader_ir: Separate float-point comparisons in ordered and unordered
This allows us to use native SPIR-V instructions without having to
manually check for NAN.
2020-05-09 04:55:15 -03:00
bunnei
e6b4311178
Merge pull request #3693 from ReinUsesLisp/clean-samplers
shader/texture: Support multiple unknown sampler properties
2020-05-02 00:45:41 -04:00
bunnei
c7b5a87c90
Merge pull request #3799 from ReinUsesLisp/iadd-cc
shader: Implement P2R CC, IADD Rd.CC and IADD.X
2020-04-30 12:56:36 -04:00
bunnei
6572660fde
Merge pull request #3788 from FernandoS27/revert
Revert: shader_decode: Fix LD, LDG when track constant buffer.
2020-04-30 12:55:39 -04:00
ReinUsesLisp
871aadbe36 shader/arithmetic_integer: Fix tracking issue in temporary
This temporary is not needed as we mark Rd.CC + IADD.X as unimplemented.
It caused issues when tracking global buffers.
2020-04-28 17:14:53 -03:00
bunnei
72b73d22ab
Merge pull request #3784 from ReinUsesLisp/shader-memory-util
shader/memory_util: Deduplicate code
2020-04-28 12:05:50 -04:00
ReinUsesLisp
ddd82ef42b shader/memory_util: Deduplicate code
Deduplicate code shared between vk_pipeline_cache and gl_shader_cache as
well as shader decoder code.

While we are at it, fix a bug in gl_shader_cache where compute shaders
had an start offset of a stage shader.
2020-04-26 01:38:51 -03:00
ReinUsesLisp
e895a4e2d7 shader/arithmetic_integer: Fix edge case and mark IADD.X Rd.CC as unimplemented
IADD.X Rd.CC requires some extra logic that is not currently
implemented. Abort when this is hit.
2020-04-25 22:58:33 -03:00
ReinUsesLisp
2a96bea6a7 shader/arithmetic_integer: Change IAdd to UAdd to avoid signed overflow
Signed integer addition overflow might be undefined behavior. It's free
to change operations to UAdd and use unsigned integers to avoid
potential bugs.
2020-04-25 22:57:54 -03:00
ReinUsesLisp
c788f9c0bd shader/arithmetic_integer: Implement IADD.X
IADD.X takes the carry flag and adds it to the result. This is generally
used to emulate 64-bit operations with 32-bit registers.
2020-04-25 22:56:11 -03:00
ReinUsesLisp
255197e643 shader/arithmetic_integer: Implement CC for IADD 2020-04-25 22:55:26 -03:00
ReinUsesLisp
ffc5ec6fa8 decode/register_set_predicate: Implement CC
P2R CC takes the state of condition codes and puts them into a register.
We already have this implemented for PR (predicates). This commit
implements CC over that.
2020-04-25 22:54:42 -03:00
ReinUsesLisp
d523734266 decode/register_set_predicate: Use move for shared pointers
Avoid atomic counters used by shared pointers.
2020-04-25 22:54:14 -03:00
bunnei
4e37825dab
Merge pull request #3734 from ReinUsesLisp/half-float-mods
decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits
2020-04-25 00:41:43 -04:00
bunnei
7c8acb0025
Merge pull request #3749 from ReinUsesLisp/lea-imm
shader/arithmetic_integer: Fix LEA_IMM encoding
2020-04-24 14:30:13 -04:00
Fernando Sahmkow
d8a961cd6c Revert: shader_decode: Fix LD, LDG when track constant buffer. 2020-04-24 11:00:54 -04:00
ReinUsesLisp
dbaebd8582 decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits
The encoding for negation and absolute value was wrong.
Extracting is now done manually. Similar instructions having different
encodings is the rule, not the exception. To keep sanity and readability
I preferred to extract the desired bit manually.

This is implemented against nxas:
8dbc389957/table.h (L68)

That is itself tested against nvdisasm (Nvidia's official disassembler).
2020-04-23 18:29:38 -03:00
ReinUsesLisp
4fb921ff6b shader/texture: Support multiple unknown sampler properties
This allows deducing some properties from the texture instruction before
asking the runtime. By doing this we can handle type mismatches in some
instructions from the renderer instead of the shader decoder.

Fixes texelFetch issues with games using 2D texture instructions on a 1D
sampler.
2020-04-23 18:04:13 -03:00
ReinUsesLisp
72deb773fd shader_ir: Turn classes into data structures 2020-04-23 18:00:06 -03:00
bunnei
2409fedacf
Merge pull request #3697 from lioncash/declarations
CMakeLists: Enable -Wmissing-declarations on Linux builds
2020-04-23 02:18:52 -04:00
bunnei
9bf3abcb63
Merge pull request #3698 from lioncash/warning
General: Resolve minor assorted warnings
2020-04-21 14:11:18 -04:00
ReinUsesLisp
8734ccb0cb shader/arithmetic_integer: Fix LEA_IMM encoding
The operand order in LEA_IMM was flipped compared to nvdisasm. Fix that
using nxas as reference:

8dbc389957/table.h (L122)
2020-04-20 21:54:59 -03:00
bunnei
73db83c0ab
Merge pull request #3679 from lioncash/track
track: Eliminate redundant copies
2020-04-19 01:22:47 -04:00
Lioncash
e2d8be1ca2 General: Resolve warnings related to missing declarations 2020-04-16 23:43:34 -04:00
Lioncash
678ac54749 decode/memory: Resolve unused variable warning
Only the first element of the returned pair is ever used.
2020-04-16 22:45:44 -04:00
Lioncash
d159643fd7 decode/texture: Resolve unused variable warnings.
Some variables aren't used, so we can remove these.

Unfortunately, diagnostics are still reported on structured bindings
even when annotated with [[maybe_unused]], so we need to unpack the
elements that we want to use manually.
2020-04-16 22:45:41 -04:00
Lioncash
f522abd8ab decode/texture: Collapse loop down into std::generate
Same behavior, less code.
2020-04-16 22:29:07 -04:00
Lioncash
7e2d60de26 decode/texture: Eliminate trivial missing field initializer warnings
We can just specify the initializers.
2020-04-16 22:27:21 -04:00
bunnei
79c1269f0f
Merge pull request #3673 from lioncash/extra
CMakeLists: Specify -Wextra on linux builds
2020-04-16 21:12:33 -04:00
Rodrigo Locatti
a5a2ee8766
Merge pull request #3689 from lioncash/unused-var
decode/shift: Remove unused variable within Shift()
2020-04-16 02:05:54 -03:00
Lioncash
cd2a12e78f decode/shift: Remove unused variable within Shift()
Removes a redundant variable that is already satisfied by the IsFull()
utility function.
2020-04-16 00:16:06 -04:00
Lioncash
72a224d3fc control_flow: Make use of std::move in TryInspectAddress()
Eliminates redundant atomic reference count increments and decrements.
2020-04-15 23:31:22 -04:00
Lioncash
24620bc4ea decode/image: Fix typo in assert in GetComponentSize() 2020-04-15 22:29:51 -04:00
Lioncash
b178c9a349 decoder/image: Fix incorrect G24R8 component sizes in GetComponentSize()
The components' sizes were mismatched. This corrects that.
2020-04-15 22:10:44 -04:00
Lioncash
e15ec2705c track: Eliminate redundant copies
Two variables can be references, while two others can be std::moved.
Makes for 4 less atomic reference count increments and decrements.
2020-04-15 21:50:09 -04:00
Lioncash
1c340c6efa CMakeLists: Specify -Wextra on linux builds
Allows reporting more cases where logic errors may exist, such as
implicit fallthrough cases, etc.

We currently ignore unused parameters, since we currently have many
cases where this is intentional (virtual interfaces).

While we're at it, we can also tidy up any existing code that causes
warnings. This also uncovered a few bugs as well.
2020-04-15 21:33:46 -04:00
Fernando Sahmkow
e33196d4e7
Merge pull request #3612 from ReinUsesLisp/red
shader/memory: Implement RED.E.ADD and minor changes to ATOM
2020-04-15 15:03:49 -04:00
ReinUsesLisp
fefe7f18f9 shader/arithmetic: Add FCMP_CR variant
Adds another variant of FCMP.
2020-04-14 19:11:04 -03:00
Mat M
7b62212461
Merge pull request #3619 from ReinUsesLisp/i2i
shader/conversion: Implement I2I sign extension, saturation and selection
2020-04-13 10:17:07 -04:00
Mat M
47036859eb
Merge pull request #3633 from ReinUsesLisp/clean-texdec
shader/texture: Remove type mismatches management from shader decoder
2020-04-13 10:13:05 -04:00
Fernando Sahmkow
3d91dbb21d
Merge pull request #3578 from ReinUsesLisp/vmnmx
shader/video: Partially implement VMNMX
2020-04-12 10:44:03 -04:00
ReinUsesLisp
76f178ba6e shader/video: Partially implement VMNMX
Implements the common usages for VMNMX. Inputs with a different size
than 32 bits are not supported and sign mismatches aren't supported
either.

VMNMX works as follows:
It grabs Ra and Rb and applies a maximum/minimum on them (this is
defined by .MX), having in mind the input sign. This result can then be
saturated. After the intermediate result is calculated, it applies
another operation on it using Rc. These operations are merges,
accumulations or another min/max pass.

This instruction allows to implement with a more flexible approach GCN's
min3 and max3 instructions (for instance).
2020-04-12 00:34:42 -03:00
ReinUsesLisp
a87b16da9a shader/texture: Remove type mismatches management from shader decoder
Since commit e22816a5bb we handle type mismatches from the CPU.
We don't need to hack our shader decoder due to game bugs anymore.

Removed in this commit.
2020-04-10 00:57:32 -03:00
bunnei
b96fd0bd0e
Merge pull request #3601 from ReinUsesLisp/some-shader-encodings
video_core/shader: Add some instruction and S2R encodings
2020-04-09 00:17:39 -04:00
Rodrigo Locatti
487f9ba525
Merge pull request #3489 from namkazt/patch-2
shader: implement SULD.D bits32/64
2020-04-07 16:21:09 -03:00
Nguyen Dac Nam
935648ffa9
address nit. 2020-04-07 18:29:30 +07:00
ReinUsesLisp
da706cad25 shader/conversion: Implement I2I sign extension, saturation and selection
Reimplements I2I adding sign extension, saturation (clamp source value
to the destination), selection and destination sizes that are not 32
bits wide.

It doesn't implement CC yet.
2020-04-07 02:19:44 -03:00
Nguyen Dac Nam
bf1174c114
Apply suggestions from code review
Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>
2020-04-07 07:55:49 +07:00
namkazy
2c98e14d13 shader_decode: SULD.D using std::pair instead of out parameter 2020-04-06 13:46:55 +07:00
namkazy
9efa51311f shader_decode: SULD.D avoid duplicate code block. 2020-04-06 13:34:06 +07:00
namkazy
7f5696513f shader_decode: SULD.D fix conversion error. 2020-04-06 13:26:58 +07:00
namkazy
2906372ba1 shader_decode: SULD.D implement bits64 and reverse shader ir init method to removed shader stage. 2020-04-06 13:09:19 +07:00
ReinUsesLisp
3185245845 shader/memory: Implement RED.E.ADD
Implements a reduction operation. It's an atomic operation that doesn't
return a value.

This commit introduces another primitive because some shading languages
might have a primitive for reduction operations.
2020-04-06 02:24:47 -03:00
ReinUsesLisp
fd0a2b5151 shader/memory: Add "using std::move" 2020-04-06 02:18:14 -03:00
ReinUsesLisp
79970c9174 shader/memory: Minor fixes in ATOM 2020-04-06 00:54:22 -03:00
Fernando Sahmkow
69277de29d
Merge pull request #3592 from ReinUsesLisp/ipa
shader_decompiler: Remove FragCoord.w hack and change IPA implementation
2020-04-05 19:29:40 -04:00
namkazy
730f9b55b3 silent warning (conversion error) 2020-04-05 16:02:07 +07:00
namkazy
9f6ebccf06 shader_decode: SULD.D -> SINT actually same as UNORM. 2020-04-05 15:18:42 +07:00
namkazy
6f2b7087c2 shader_decode: SULD.D fix decode SNORM component 2020-04-05 14:46:43 +07:00
namkazy
69657ff19c clang-format 2020-04-05 12:57:50 +07:00
namkazy
24cc64c5b3 shader_decode: get sampler descriptor from registry. 2020-04-05 12:54:48 +07:00
namkazy
acd3f0ab37 tweaking. 2020-04-05 10:31:32 +07:00