David 
							
						 
					 
					
						
						
							
						
						9d69206cd0 
					 
					
						
						
							
							Merge pull request  #2870  from FernandoS27/multi-draw  
						
						
						
						
						Implement a MME Draw commands Inliner and correct host instance drawing 
						
						
					 
					
						2019-09-22 23:13:02 +10:00 
						 
				 
			
				
					
						
							
							
								Rodrigo Locatti 
							
						 
					 
					
						
						
							
						
						9286976948 
					 
					
						
						
							
							Merge pull request  #2878  from FernandoS27/icmp  
						
						
						
						
						shader_ir: Implement ICMP 
						
						
					 
					
						2019-09-21 18:06:07 -03:00 
						 
				 
			
				
					
						
							
							
								Fernando Sahmkow 
							
						 
					 
					
						
						
							
						
						527b841c15 
					 
					
						
						
							
							Shader_IR: ICMP corrections and fixes  
						
						
						
						
					 
					
						2019-09-21 14:28:03 -04:00 
						 
				 
			
				
					
						
							
							
								bunnei 
							
						 
					 
					
						
						
							
						
						88d857499b 
					 
					
						
						
							
							Merge pull request  #2855  from ReinUsesLisp/shfl  
						
						
						
						
						shader_ir/warp: Implement SHFL for Nvidia devices 
						
						
					 
					
						2019-09-20 17:10:42 -04:00 
						 
				 
			
				
					
						
							
							
								Fernando Sahmkow 
							
						 
					 
					
						
						
							
						
						4b81d19a1a 
					 
					
						
						
							
							Shader_IR: Implement ICMP.  
						
						
						
						
					 
					
						2019-09-19 20:56:29 -04:00 
						 
				 
			
				
					
						
							
							
								Fernando Sahmkow 
							
						 
					 
					
						
						
							
						
						7606da5611 
					 
					
						
						
							
							VideoCore: Corrections to the MME Inliner and removal of hacky instance management.  
						
						
						
						
					 
					
						2019-09-19 11:41:29 -04:00 
						 
				 
			
				
					
						
							
							
								bunnei 
							
						 
					 
					
						
						
							
						
						b31880dc5e 
					 
					
						
						
							
							Merge pull request  #2784  from ReinUsesLisp/smem  
						
						
						
						
						shader_ir: Implement shared memory 
						
						
					 
					
						2019-09-18 16:26:05 -04:00 
						 
				 
			
				
					
						
							
							
								ReinUsesLisp 
							
						 
					 
					
						
						
							
						
						0526bf1895 
					 
					
						
						
							
							shader_ir/warp: Implement SHFL  
						
						
						
						
					 
					
						2019-09-17 17:44:07 -03:00 
						 
				 
			
				
					
						
							
							
								ReinUsesLisp 
							
						 
					 
					
						
						
							
						
						36abf67e79 
					 
					
						
						
							
							shader/image: Implement SUATOM and fix SUST  
						
						
						
						
					 
					
						2019-09-10 20:22:31 -03:00 
						 
				 
			
				
					
						
							
							
								bunnei 
							
						 
					 
					
						
						
							
						
						34b2c60f95 
					 
					
						
						
							
							Merge pull request  #2823  from ReinUsesLisp/shr-clamp  
						
						
						
						
						shader/shift: Implement SHR wrapped and clamped variants 
						
						
					 
					
						2019-09-10 11:56:17 -04:00 
						 
				 
			
				
					
						
							
							
								ReinUsesLisp 
							
						 
					 
					
						
						
							
						
						1f43e5296f 
					 
					
						
						
							
							gl_shader_decompiler: Keep track of written images and mark them as modified  
						
						
						
						
					 
					
						2019-09-05 23:26:05 -03:00 
						 
				 
			
				
					
						
							
							
								ReinUsesLisp 
							
						 
					 
					
						
						
							
						
						3a450c1395 
					 
					
						
						
							
							kepler_compute: Implement texture queries  
						
						
						
						
					 
					
						2019-09-05 20:35:51 -03:00 
						 
				 
			
				
					
						
							
							
								ReinUsesLisp 
							
						 
					 
					
						
						
							
						
						4de04eba39 
					 
					
						
						
							
							shader_ir: Implement LD_S  
						
						
						
						
						Loads from shared memory. 
						
						
					 
					
						2019-09-05 01:38:37 -03:00 
						 
				 
			
				
					
						
							
							
								ReinUsesLisp 
							
						 
					 
					
						
						
							
						
						f17415d431 
					 
					
						
						
							
							shader_ir: Implement ST_S  
						
						
						
						
						This instruction writes to a memory buffer shared with threads within
the same work group. It is known as "shared" memory in GLSL. 
						
						
					 
					
						2019-09-05 01:38:37 -03:00 
						 
				 
			
				
					
						
							
							
								ReinUsesLisp 
							
						 
					 
					
						
						
							
						
						77ef4fa907 
					 
					
						
						
							
							shader/shift: Implement SHR wrapped and clamped variants  
						
						
						
						
						Nvidia defaults to wrapped shifts, but this is undefined behaviour on
OpenGL's spec. Explicitly mask/clamp according to what the guest shader
requires. 
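
The difference between the two variants can be sketched in C++ roughly as follows. This is only an illustration, not the decompiler's code: the helper names are hypothetical, a 32-bit logical shift is assumed, and "clamped" is assumed to mean limiting the shift amount to 31.

```cpp
#include <cstdint>

// Hypothetical helpers; 32-bit unsigned operands are assumed.

// Wrapped: only the low five bits of the shift amount are used,
// so a shift by 33 behaves like a shift by 1.
constexpr std::uint32_t ShiftRightWrapped(std::uint32_t value, std::uint32_t shift) {
    return value >> (shift & 31u);
}

// Clamped: shift amounts above 31 are limited to 31 (assumed semantics),
// so a shift by 33 behaves like a shift by 31.
constexpr std::uint32_t ShiftRightClamped(std::uint32_t value, std::uint32_t shift) {
    return value >> (shift > 31u ? 31u : shift);
}
```

Either way the shift amount handed to the host is always in the range 0 to 31, so the result no longer depends on how a particular driver treats out-of-range shifts.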
						
						
					 
					
						2019-09-04 01:55:24 -03:00 
						 
				 
			
				
					
						
							
							
								ReinUsesLisp 
							
						 
					 
					
						
						
							
						
						dfae2d141a 
					 
					
						
						
							
							half_set_predicate: Fix predicate assignments  
						
						
						
						
					 
					
						2019-09-04 01:54:23 -03:00 
						 
				 
			
				
					
						
							
							
								bunnei 
							
						 
					 
					
						
						
							
						
						81fbc5370d 
					 
					
						
						
							
							Merge pull request  #2812  from ReinUsesLisp/f2i-selector  
						
						
						
						
						shader_ir/conversion: Implement F2I and F2F F16 selector 
						
						
					 
					
						2019-09-03 22:35:33 -04:00 
						 
				 
			
				
					
						
							
							
								bunnei 
							
						 
					 
					
						
						
							
						
						d4f33b822b 
					 
					
						
						
							
							Merge pull request  #2811  from ReinUsesLisp/fsetp-fix  
						
						
						
						
						float_set_predicate: Add missing negation bit for the second operand 
						
						
					 
					
						2019-09-03 22:34:34 -04:00 
						 
				 
			
				
					
						
							
							
								Rodrigo Locatti 
							
						 
					 
					
						
						
							
						
						4d4f9cc104 
					 
					
						
						
							
							video_core: Silent miscellaneous warnings  ( #2820 )  
						
						
						
						
						* texture_cache/surface_params: Remove unused local variable
* rasterizer_interface: Add missing documentation commentary
* maxwell_dma: Remove unused rasterizer reference
* video_core/gpu: Sort member declaration order to silent -Wreorder warning
* fermi_2d: Remove unused MemoryManager reference
* video_core: Silent unused variable warnings
* buffer_cache: Silent -Wreorder warnings
* kepler_memory: Remove unused MemoryManager reference
* gl_texture_cache: Add missing override
* buffer_cache: Add missing include
* shader/decode: Remove unused variables 
						
						
					 
					
						2019-08-30 14:08:00 -04:00 
						 
				 
			
				
					
						
							
							
								bunnei 
							
						 
					 
					
						
						
							
						
						f8cc5668f8 
					 
					
						
						
							
							Merge pull request  #2758  from ReinUsesLisp/packed-tid  
						
						
						
						
						shader/decode: Implement S2R Tic 
						
						
					 
					
						2019-08-29 12:58:43 -04:00 
						 
				 
			
				
					
						
							
							
								ReinUsesLisp 
							
						 
					 
					
						
						
							
						
						e3534700d7 
					 
					
						
						
							
							shader_ir/conversion: Split int and float selector and implement F2F H1  
						
						
						
						
					 
					
						2019-08-28 16:09:33 -03:00 
						 
				 
			
				
					
						
							
							
								ReinUsesLisp 
							
						 
					 
					
						
						
							
						
						b13fbc25b8 
					 
					
						
						
							
							shader_ir/conversion: Implement F2I F16 Ra.H1  
						
						
						
						
					 
					
						2019-08-27 23:40:40 -03:00 
						 
				 
			
				
					
						
							
							
								ReinUsesLisp 
							
						 
					 
					
						
						
							
						
						6207751b00 
					 
					
						
						
							
							float_set_predicate: Add missing negation bit for the second operand  
						
						
						
						
					 
					
						2019-08-27 21:57:43 -03:00 
						 
				 
			
				
					
						
							
							
								ReinUsesLisp 
							
						 
					 
					
						
						
							
						
						4e35177e23 
					 
					
						
						
							
							shader_ir: Implement VOTE  
						
						
						
						
						Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here:
https://developer.nvidia.com/reading-between-threads-shader-intrinsics

Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.

To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:

* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true

ballotARB, also known as "uint64_t(activeThreadsNV())", emits

    VOTE.ANY Rd, PT, PT;

on nouveau's compiler. This doesn't exactly match Nvidia's code

    VOTE.ALL Rd, PT, PT;

which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter since .ANY, .ALL and .EQ affect the predicates
(set to PT in those cases) and not the registers. 
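
The warp-size-one stub described above can be pictured with a small C++ sketch. The function names are hypothetical stand-ins for the three intrinsics, and the code only models the mapping listed in the message; it is not the code the commit emits.

```cpp
// Hypothetical helpers modelling a warp that contains exactly one thread.
// With a single participant, "any" and "all" both reduce to that thread's
// own value, and "all equal" is trivially true.
constexpr bool AnyThread(bool value) {
    return value;
}

constexpr bool AllThreads(bool value) {
    return value;
}

constexpr bool AllThreadsEqual(bool /*value*/) {
    return true;
}
```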
						
						
					 
					
						2019-08-21 14:50:38 -03:00 
						 
				 
			
				
					
						
							
							
								bunnei 
							
						 
					 
					
						
						
							
						
						dfdd20142e 
					 
					
						
						
							
							Merge pull request  #2777  from ReinUsesLisp/hsetp2-fe3h-fix  
						
						
						
						
						half_set_predicate: Fix HSETP2_C constant buffer offset 
						
						
					 
					
						2019-08-21 10:29:17 -04:00 
						 
				 
			
				
					
						
							
							
								bunnei 
							
						 
					 
					
						
						
							
						
						cedc1aab4a 
					 
					
						
						
							
							Merge pull request  #2753  from FernandoS27/float-convert  
						
						
						
						
						Shader_Ir: Implement F16 Variants of F2F, F2I, I2F. 
						
						
					 
					
						2019-08-21 10:27:57 -04:00 
						 
				 
			
				
					
						
							
							
								bunnei 
							
						 
					 
					
						
						
							
						
						ca61e298b3 
					 
					
						
						
							
							Merge pull request  #2778  from ReinUsesLisp/nop  
						
						
						
						
						shader_ir: Implement NOP 
						
						
					 
					
						2019-08-18 08:51:34 -04:00 
						 
				 
			
				
					
						
							
							
								ReinUsesLisp 
							
						 
					 
					
						
						
							
						
						2ff8044806 
					 
					
						
						
							
							shader_ir: Implement NOP  
						
						
						
						
					 
					
						2019-08-04 03:02:55 -03:00 
						 
				 
			
				
					
						
							
							
								ReinUsesLisp 
							
						 
					 
					
						
						
							
						
						ec0da3ef64 
					 
					
						
						
							
							half_set_predicate: Fix HSETP2_C constant buffer offset  
						
						
						
						
					 
					
						2019-08-04 02:50:55 -03:00 
						 
				 
			
				
					
						
							
							
								ReinUsesLisp 
							
						 
					 
					
						
						
							
						
						77f1a676a1 
					 
					
						
						
							
							decode/half_set_predicate: Fix predicates  
						
						
						
						
					 
					
						2019-07-26 00:12:38 -03:00 
						 
				 
			
				
					
						
							
							
								bunnei 
							
						 
					 
					
						
						
							
						
						b0ff3179ef 
					 
					
						
						
							
							Merge pull request  #2739  from lioncash/cflow  
						
						
						
						
						video_core/control_flow: Minor changes/warning cleanup 
						
						
					 
					
						2019-07-25 13:04:56 -04:00 
						 
				 
			
				
					
						
							
							
								bunnei 
							
						 
					 
					
						
						
							
						
						4d26550f5f 
					 
					
						
						
							
							Merge pull request  #2737  from FernandoS27/track-fix  
						
						
						
						
						Shader_Ir: Correct tracking to track from right to left 
						
						
					 
					
						2019-07-25 12:41:52 -04:00 
						 
				 
			
				
					
						
							
							
								bunnei 
							
						 
					 
					
						
						
							
						
						31e8a61527 
					 
					
						
						
							
							Merge pull request  #2743  from FernandoS27/surpress-assert  
						
						
						
						
						Downgrade and suppress a series of GPU asserts and debug messages. 
						
						
					 
					
						2019-07-25 12:34:36 -04:00 
						 
				 
			
				
					
						
							
							
								ReinUsesLisp 
							
						 
					 
					
						
						
							
						
						104641db07 
					 
					
						
						
							
							shader/decode: Implement S2R Tic  
						
						
						
						
					 
					
						2019-07-22 16:16:10 -03:00 
						 
				 
			
				
					
						
							
							
								Fernando Sahmkow 
							
						 
					 
					
						
						
							
						
						11f4e739bd 
					 
					
						
						
							
							Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.  
						
						
						
						
						This commit takes care of implementing the F16 Variants of the 
conversion instructions and makes sure conversions are done. 
						
						
					 
					
						2019-07-20 17:38:25 -04:00 
						 
				 
			
				
					
						
							
							
								Fernando Sahmkow 
							
						 
					 
					
						
						
							
						
						1158777737 
					 
					
						
						
							
							Shader_Ir: Change Debug Asserts for Log Warnings  
						
						
						
						
					 
					
						2019-07-19 22:15:34 -04:00 
						 
				 
			
				
					
						
							
							
								ReinUsesLisp 
							
						 
					 
					
						
						
							
						
						45c162444d 
					 
					
						
						
							
							shader/half_set_predicate: Fix HSETP2 implementation  
						
						
						
						
					 
					
						2019-07-19 22:21:22 -03:00 
						 
				 
			
				
					
						
							
							
								ReinUsesLisp 
							
						 
					 
					
						
						
							
						
						6c4985edc9 
					 
					
						
						
							
							shader/half_set_predicate: Implement missing HSETP2 variants  
						
						
						
						
					 
					
						2019-07-19 22:20:47 -03:00 
						 
				 
			
				
					
						
							
							
								Lioncash 
							
						 
					 
					
						
						
							
						
						c1c89411da 
					 
					
						
						
							
							video_core/control_flow: Provide operator!= for types with operator==  
						
						
						
						
						Provides operational symmetry for the respective structures. 
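
A minimal sketch of the pattern, assuming a hypothetical structure with two integer fields (the real structures and their members are not shown in this log): operator!= is written in terms of operator== so the two can never disagree.

```cpp
// Hypothetical structure; the field names are illustrative only.
struct BranchSketch {
    int predicate;
    int condition_code;

    constexpr bool operator==(const BranchSketch& other) const {
        return predicate == other.predicate && condition_code == other.condition_code;
    }

    // Defined as the negation of operator== to keep both in sync.
    constexpr bool operator!=(const BranchSketch& other) const {
        return !operator==(other);
    }
};
```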
						
						
					 
					
						2019-07-18 21:03:31 -04:00 
						 
				 
			
				
					
						
							
							
								Lioncash 
							
						 
					 
					
						
						
							
						
						1780e0e3d0 
					 
					
						
						
							
							video_core/control_flow: Prevent sign conversion in TryGetBlock()  
						
						
						
						
						The return value is a u32, not an s32, so this would result in an
implicit signedness conversion. 
						
						
					 
					
						2019-07-18 21:03:31 -04:00 
						 
				 
			
				
					
						
							
							
								Lioncash 
							
						 
					 
					
						
						
							
						
						a162a844d2 
					 
					
						
						
							
							video_core/control_flow: Remove unnecessary BlockStack copy constructor  
						
						
						
						
						This is the default behavior of the copy constructor, so it doesn't need
to be specified.

While we're at it we can make the other non-default constructor
explicit. 
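
Both points can be illustrated with a short sketch. The types below are hypothetical stand-ins (the real BlockStack and its members are not shown in this log): no copy constructor is declared, so the compiler-generated memberwise copy is used, and the remaining single-argument constructor is marked explicit.

```cpp
#include <type_traits>

// Hypothetical stand-ins for illustration only.
struct QuerySketch {
    int ssy_depth;
    int pbk_depth;
};

class StackSketch {
public:
    constexpr StackSketch() = default;

    // explicit: a QuerySketch is never silently converted to a StackSketch.
    constexpr explicit StackSketch(const QuerySketch& query)
        : ssy_depth{query.ssy_depth}, pbk_depth{query.pbk_depth} {}

    // No copy constructor is declared. The implicitly generated one copies
    // each member, which is all a hand-written version would have done.

    int ssy_depth = 0;
    int pbk_depth = 0;
};

// Copies a stack through the implicit copy constructor and sums its fields.
constexpr int CopiedDepthSum() {
    const StackSketch original{QuerySketch{3, 4}};
    const StackSketch copy = original;
    return copy.ssy_depth + copy.pbk_depth;
}
```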
						
						
					 
					
						2019-07-18 21:03:30 -04:00 
						 
				 
			
				
					
						
							
							
								Lioncash 
							
						 
					 
					
						
						
							
						
						56bc11d952 
					 
					
						
						
							
							video_core/control_flow: Use std::move where applicable  
						
						
						
						
						Results in less work being done where avoidable. 
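
As an illustration of the saving, the sketch below uses a hypothetical type that counts its own copies. Both holders take their argument by value; one copies it into the member, the other moves it. None of these names come from the code base.

```cpp
#include <utility>

// Hypothetical type that counts how many times it has been copied.
struct CopyCounter {
    int copies = 0;

    constexpr CopyCounter() = default;
    constexpr CopyCounter(const CopyCounter& other) : copies{other.copies + 1} {}
    constexpr CopyCounter(CopyCounter&& other) noexcept : copies{other.copies} {}
};

// Takes its argument by value and copies it into the member.
struct HolderByCopy {
    CopyCounter payload;
    constexpr explicit HolderByCopy(CopyCounter value) : payload{value} {}
};

// Takes its argument by value and moves it into the member instead.
struct HolderByMove {
    CopyCounter payload;
    constexpr explicit HolderByMove(CopyCounter value) : payload{std::move(value)} {}
};
```

For a type that owns heap memory, such as std::vector or std::string, the avoided copy would typically be an allocation plus an element-wise copy.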
						
						
					 
					
						2019-07-18 21:03:30 -04:00 
						 
				 
			
				
					
						
							
							
								Lioncash 
							
						 
					 
					
						
						
							
						
						e7b39f47f8 
					 
					
						
						
							
							video_core/control_flow: Use the prefix variant of operator++ for iterators  
						
						
						
						
						Same thing, but potentially allows a standard library implementation to
pick a more efficient codepath. 
						
						
					 
					
						2019-07-18 21:03:30 -04:00 
						 
				 
			
				
					
						
							
							
								Lioncash 
							
						 
					 
					
						
						
							
						
						6885e7e7ec 
					 
					
						
						
							
							video_core/control_flow: Use empty() member function for checking emptiness  
						
						
						
						
						It's what it's there for. 
						
						
					 
					
						2019-07-18 21:03:30 -04:00 
						 
				 
			
				
					
						
							
							
								Lioncash 
							
						 
					 
					
						
						
							
						
						45fa12a05c 
					 
					
						
						
							
							video_core: Resolve -Wreorder warnings  
						
						
						
						
						Ensures that the constructor members are always initialized in the order
that they're declared in. 
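
A small sketch of why the order matters, using a hypothetical class in which one member is computed from another. C++ always initializes members in declaration order, regardless of how the initializer list is written, so keeping the list in the same order avoids both the warning and any surprise.

```cpp
// Hypothetical class: `bytes` is computed from `count`, so `count` must be
// initialized first.
class BufferSketch {
public:
    // The initializer list follows the declaration order below. Writing it
    // in another order would not change the actual initialization order and
    // triggers -Wreorder on GCC and Clang.
    constexpr explicit BufferSketch(int count_) : count{count_}, bytes{count * 4} {}

    constexpr int Bytes() const {
        return bytes;
    }

private:
    int count;
    int bytes;
};
```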
						
						
					 
					
						2019-07-18 21:03:30 -04:00 
						 
				 
			
				
					
						
							
							
								Lioncash 
							
						 
					 
					
						
						
							
						
						47df844338 
					 
					
						
						
							
							video_core/control_flow: Make program_size for ScanFlow() a std::size_t  
						
						
						
						
						Prevents a truncation warning from occurring with MSVC. Also the
internal data structures already treat it as a size_t, so this is just a
discrepancy in the interface. 
						
						
					 
					
						2019-07-18 21:03:29 -04:00 
						 
				 
			
				
					
						
							
							
								Lioncash 
							
						 
					 
					
						
						
							
						
						3df9558593 
					 
					
						
						
							
							video_core/control_flow: Place all internally linked types/functions within an anonymous namespace  
						
						
						
						
						Previously, quite a few of these functions had external
linkage. 
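
The general pattern looks roughly like this. The function names are hypothetical; only the use of an anonymous namespace for file-local helpers reflects the commit.

```cpp
namespace {

// Internal linkage: this helper is visible only inside this translation unit.
constexpr int DoubleIt(int value) {
    return value * 2;
}

} // Anonymous namespace

// External linkage: other translation units may call this entry point.
int ScanLength(int value) {
    return DoubleIt(value) + 1;
}
```

Keeping helpers out of the external symbol table avoids name clashes with other translation units and gives the compiler more freedom to inline or discard them.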
						
						
					 
					
						2019-07-18 21:03:29 -04:00 
						 
				 
			
				
					
						
							
							
								Lioncash 
							
						 
					 
					
						
						
							
						
						1109db86b7 
					 
					
						
						
							
							video_core/shader/decode: Prevent sign-conversion warnings  
						
						
						
						
						Makes it explicit that the conversions here are intentional. 
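
A minimal sketch of an intentional signed-to-unsigned conversion, assuming the usual fixed-width aliases; the helper name is hypothetical. The static_cast documents the intent and keeps sign-conversion warnings quiet.

```cpp
#include <cstdint>

using s32 = std::int32_t;
using u32 = std::uint32_t;

// Hypothetical helper: the cast states that converting a signed offset to an
// unsigned value is intended rather than accidental.
constexpr u32 ToUnsigned(s32 offset) {
    return static_cast<u32>(offset);
}
```

The conversion itself is well defined: the result is the value modulo 2^32, so -1 becomes 0xFFFFFFFF.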
						
						
					 
					
						2019-07-18 21:03:29 -04:00 
						 
				 
			
				
					
						
							
							
								bunnei 
							
						 
					 
					
						
						
							
						
						63bda67a34 
					 
					
						
						
							
							Merge pull request  #2738  from lioncash/shader-ir  
						
						
						
						
						shader-ir: Minor cleanup-related changes 
						
						
					 
					
						2019-07-18 13:52:01 -04:00 
						 
				 
			
				
					
						
							
							
								Fernando Sahmkow 
							
						 
					 
					
						
						
							
						
						5a06e33859 
					 
					
						
						
							
							Shader_Ir: correct clang format  
						
						
						
						
					 
					
						2019-07-18 10:09:26 -04:00