В.М.> Guys, don't confuse so-called "physics computation" for games with ordinary engineering calculations. I'd be very curious to hear about systems that actually work for computing, say, forces, loads and stresses with FEM on video cards.
There's plenty of this, go read. I'll say up front that these people know what numerical precision is.
They compute forces in, for example, the N-body problem, and report the speedup relative to both a CPU and a special-purpose machine (yes, they built themselves a massively parallel special-purpose computer for acceleration, with custom processors and a custom architecture; it's all that serious :). The results are given in the papers.
High-performance direct gravitational N-body simulations on graphics processing units
New Astronomy, Volume 12, Issue 8, November 2007, Pages 641-650
Simon F. Portegies Zwart, Robert G. Belleman and Peter M. Geldof
Large steps in GPU-based deformable bodies simulation
Simulation Modelling Practice and Theory, Volume 13, Issue 8, November 2005, Pages 703-715
Eduardo Tejada and Thomas Ertl
Performance analysis of direct N-body algorithms on special-purpose supercomputers
New Astronomy, Volume 12, Issue 5, July 2007, Pages 357-377
Stefan Harfst, Alessia Gualandris, David Merritt, Rainer Spurzem, Simon Portegies Zwart and Peter Berczik
Implementation and performance evaluation of reconstruction algorithms on graphics processors
Journal of Structural Biology, Volume 157, Issue 1, January 2007, Pages 288-295
Daniel Castaño Díez, Hannes Mueller and Achilleas S. Frangakis
Visual simulation of shallow-water waves
Simulation Modelling Practice and Theory, Volume 13, Issue 8, November 2005, Pages 716-726
T.R. Hagen, J.M. Hjelmervik, K.-A. Lie, J.R. Natvig and M. Ofstad Henriksen
Quantum Monte Carlo on graphical processing units
Computer Physics Communications, Volume 177, Issue 3, 1 August 2007, Pages 298-306
Amos G. Anderson, William A. Goddard III and Peter Schröder
Lattice QCD as a video game
Computer Physics Communications, Volume 177, Issue 8, 15 October 2007, Pages 631-639
Győző I. Egri, Zoltán Fodor, Christian Hoelbling, Sándor D. Katz, Dániel Nógrádi and Kálmán K. Szabó
Mass-spring systems on the GPU
Simulation Modelling Practice and Theory, Volume 13, Issue 8, November 2005, Pages 693-702
Joachim Georgii and Rüdiger Westermann
And so on; these are only the most recent ones.
P.S. Here are some detailed examples:
High-performance direct gravitational N-body simulations on graphics processing units
10.1016/j.newast.2007.05.004
We present the results of gravitational direct N-body simulations using the commercial graphics processing units (GPU) NVIDIA Quadro FX1400 and GeForce 8800GTX, and compare the results with GRAPE-6Af special purpose hardware. The force evaluation of the N-body problem was implemented in Cg using the GPU directly to speed-up the calculations. The integration of the equations of motions were, running on the host computer, implemented in C using the 4th order predictor–corrector Hermite integrator with block time steps. We find that for a large number of particles (N ≳ 10^4) modern graphics processing units offer an attractive low cost alternative to GRAPE special purpose hardware. A modern GPU continues to give a relatively flat scaling with the number of particles, comparable to that of the GRAPE. The GRAPE is designed to reach double precision, whereas the GPU is intrinsically single-precision. For relatively large time steps, the total energy of the N-body system was conserved better than to one in 10^6 on the GPU, which is impressive given the single-precision nature of the GPU. For the same time steps, the GRAPE gave somewhat more accurate results, by about an order of magnitude. However, smaller time steps allowed more energy accuracy on the GRAPE, around 10^−11, whereas for the GPU machine precision saturates around 10^−6. For N ≳ 10^6 the GeForce 8800GTX was about 20 times faster than the host computer. Though still about a factor of a few slower than GRAPE, modern GPUs outperform GRAPE in their low cost, long mean time between failure and the much larger onboard memory; the GRAPE-6Af holds at most 256k particles whereas the GeForce 8800GTX can hold 9 million particles in memory.
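To make concrete what "force evaluation" means in that abstract, here is a minimal sketch of direct summation in plain Python. This is not the authors' Cg code: the function name `direct_accelerations`, the softening parameter `eps`, and the choice of units with G = 1 are my own assumptions for illustration. The point is only that each particle sums contributions from all others, which is the O(N^2) part that gets offloaded to the GPU (or to the GRAPE).

```python
import math

def direct_accelerations(pos, mass, eps=1e-3, G=1.0):
    """Pairwise gravitational accelerations by direct O(N^2) summation.

    eps is a softening length (an assumption of this sketch) that keeps
    the force finite when two particles come very close.
    """
    n = len(pos)
    acc = []
    for i in range(n):
        xi, yi, zi = pos[i]
        ax = ay = az = 0.0
        for j in range(n):
            if i == j:
                continue
            dx = pos[j][0] - xi
            dy = pos[j][1] - yi
            dz = pos[j][2] - zi
            r2 = dx * dx + dy * dy + dz * dz + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            ax += G * mass[j] * dx * inv_r3
            ay += G * mass[j] * dy * inv_r3
            az += G * mass[j] * dz * inv_r3
        acc.append([ax, ay, az])
    return acc

# Two unit masses one length unit apart, no softening:
# each is pulled toward the other with unit acceleration.
pos = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
mass = [1.0, 1.0]
acc = direct_accelerations(pos, mass, eps=0.0)
print(acc)
```

The inner loop for particle i does not depend on the result for any other particle, which is why this step maps well onto many parallel processors. The time integration (the Hermite scheme in the paper) is a separate step and is not shown here.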
Lattice QCD as a video game
10.1016/j.cpc.2007.06.005
The speed, bandwidth and cost characteristics of today's PC graphics cards make them an attractive target as general purpose computational platforms. High performance can be achieved also for lattice simulations but the actual implementation can be cumbersome. This paper outlines the architecture and programming model of modern graphics cards for the lattice practitioner with the goal of exploiting these chips for Monte Carlo simulations. Sample code is also given.
Fig. 1 and Fig. 2 show sustained performances for both Wilson and staggered matrix multiplication on various lattice sizes and a comparison is given with SSE optimized CPU codes on an Intel P4. ... For reference we give some numbers from Fig. 1 for the NVIDIA 8800 GTX card: 33 Gflops sustained performance on a 16^3×60 lattice using the Wilson kernel. ... As reference we list below the specification of the NVIDIA 8800 GTX card: 128 fragment processors, 1350 MHz clock speed, 768 MB DDR3 memory at 1800 MHz, 384 bit wide bus, 86.4 GB/s peak memory bandwidth.
They compared against a 3 GHz Pentium 4 running SSE2-optimized code; it computes these matrix products 15 to 20 times slower.
There is also a reference there that may be interesting:
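As a sanity check on the card specification quoted above, the 86.4 GB/s peak bandwidth follows directly from the memory clock and the bus width. A small sketch, assuming the 1800 MHz figure is the effective transfer rate and that GB means 10^9 bytes (the function name is mine, not from the paper):

```python
def peak_bandwidth_gb_s(effective_clock_mhz, bus_width_bits):
    """Peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    Assumes one transfer of bus_width_bits per effective clock cycle.
    """
    bytes_per_transfer = bus_width_bits / 8
    return effective_clock_mhz * 1e6 * bytes_per_transfer / 1e9

bw = peak_bandwidth_gb_s(1800, 384)
print(f"{bw:.1f} GB/s")  # prints "86.4 GB/s"
```

So the numbers in the spec list are consistent with each other: 384 bits is 48 bytes per transfer, times 1800 million transfers per second.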
IDAV: Publications
Mass-spring systems on the GPU
10.1016/j.simpat.2005.08.004
We present and analyze different implementations of mass-spring systems for interactive simulation of deformable surfaces on graphics processing units (GPUs). For the amount of springs we target, numerical time integration of spring displacements needs to be accelerated and the transfer of displaced point positions for rendering must be avoided. To fulfill these requirements, we exploit features of recent graphics accelerators to simulate spring elongation and compression on the GPU, saving displaced point masses in graphics memory, and then sending these positions through the GPU again to render the deformed surface. Two different simulation algorithms implementing scattering and gathering operations on the GPU are compared with respect to performance and numerical accuracy. We discuss GPU specific issues to be considered in simulation techniques showing similar computation and memory access patterns to mass-spring systems.
(They computed in 32-bit precision.)
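For anyone unfamiliar with the "scattering and gathering" wording in that abstract, here is a rough CPU sketch of the two ways to accumulate spring forces. This is only my illustration in plain Python, not the authors' GPU implementation; the helper names (`hooke`, `forces_scatter`, `forces_gather`), the 2D setup, and the stiffness value are all assumptions. In the scatter variant each spring computes its force once and writes it to both endpoints; in the gather variant each point mass reads all springs attached to it and sums their forces itself.

```python
import math

def hooke(pa, pb, rest, k):
    """Force on the point at pa from a spring connecting pa to pb (Hooke's law)."""
    dx, dy = pb[0] - pa[0], pb[1] - pa[1]
    length = math.hypot(dx, dy)
    scale = k * (length - rest) / length
    return (scale * dx, scale * dy)

def forces_scatter(points, springs, k):
    """Loop over springs; each spring writes its force to both endpoints."""
    forces = [[0.0, 0.0] for _ in points]
    for a, b, rest in springs:
        fx, fy = hooke(points[a], points[b], rest, k)
        forces[a][0] += fx
        forces[a][1] += fy
        forces[b][0] -= fx
        forces[b][1] -= fy
    return forces

def forces_gather(points, springs, k):
    """Loop over points; each point reads its attached springs and sums them."""
    incident = [[] for _ in points]
    for a, b, rest in springs:
        incident[a].append((b, rest))
        incident[b].append((a, rest))
    forces = []
    for i, p in enumerate(points):
        fx = fy = 0.0
        for j, rest in incident[i]:
            gx, gy = hooke(p, points[j], rest, k)
            fx += gx
            fy += gy
        forces.append([fx, fy])
    return forces

# Three points joined by two springs, each stretched to twice its rest length.
points = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0)]
springs = [(0, 1, 1.0), (1, 2, 1.0)]
fs = forces_scatter(points, springs, k=10.0)
fg = forces_gather(points, springs, k=10.0)
print(fs)
print(fg)
```

Gather does each spring's arithmetic twice but every output location is written by exactly one loop iteration, while scatter does it once but writes to shared locations. The two variants can also add the terms in a different order, so in floating point the results need not be bit-identical in general, which is one reason a comparison of numerical accuracy makes sense.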