Small update: I managed to get a speedup of more than 8-10 using GPU with large number of repetitions.

More update: Using single precision floating point operations gives 2x 240 MS/s with the GPU.

% This program compares the difference in speed between calculating

% a cross correlation of 2x8 MS vectors using CPU and GPU

% on i7-2600K @ 3.4 GHz with 8 GB of RAM vs. GeForce GTX 670

% the speedup is around 4.9, cross correlation can be efficiently

% calculated through the use of fast fourier transform.

% If only cross-spectrum is of interest then inverse fft is

% unnecessary and some time can be saved.

clear all;

a = randn(2^23, 1);

b = randn(2^23, 1);

tic;

A = fft(a);

B = fft(b);

C = ifft(times(conj(A), B), 'symmetric'); % xcorr of CH1 and CH2

aa = dot(a, a); % zero delay autocorr of CH1

bb = dot(b, b); % zero delay autocorr of CH2

timecpu = toc

tic;

a = gpuArray(a);

b = gpuArray(b);

A = fft(a);

B = fft(b);

C = ifft(times(conj(A), B), 'symmetric');

aa = dot(a, a);

bb = dot(b, b);

C = gather(C);

aa = gather(aa);

bb = gather(bb);

timegpu = toc

speed = timecpu/timegpu

---------------

N=21;

n=100;

n1 = randn(2^N, 1);

n2 = randn(2^N, 1);

tic;

for z=1:n

a = gpuArray(n1);

b = gpuArray(n2);

A = fft(a);

B = fft(b);

C = ifft(times(conj(A), B), 'symmetric');

C = gather(C); % if you don't gather -> 96 MSPS (x2)

end

timegpu = toc

speed = n*2^N/timegpu/1e6

timegpu =

2.8006

speed =

74.8825

2x 75 MS/s, if you don't gather (for example just average on the GPU) 2x 96 MS/s, 2x240 MS/s with single precision floating point instructions ... randn(2^N, 1, 'single').

http://en.wikipedia.org/wiki/Cross_correlation

--------------------

Intel Core i7-2600K (4x 3.4 GHz)

GeForce GTX 670 (1344 CUDA-cores, 915 MHz, 6 GHz memory)

Averaging performance including data transfer overheads :

CPU single precision dot-product (1-point xcorr) 2x 600 MS/s

CPU double precision dot-product (1-point xcorr) 2x 300 MS/s

CPU single precision cross-correlation 2x 23 MS/s

CPU double precision cross-correlation 2x 13 MS/s

GPU single precision dot-product (1-point xcorr) 2x 433 MS/s

GPU double precision dot-product (1-point xcorr) 2x 227 MS/s

GPU single precision cross-correlation 2x 225 MS/s

GPU double precision cross-correlation 2x 90 MS/s

--------------------

In the spirit of the Hobbit... |

% Simple RSA cryptosystem testing in matlab

one = java.math.BigInteger.ONE;

p = one.probablePrime(256, java.util.Random);

q = one.probablePrime(256, java.util.Random);

n = p.multiply(q); % first part of the public key (n, e)

tmp0 = p.subtract(one);

tmp1 = q.subtract(one);

totient = tmp0.multiply(tmp1);

tmp0 = java.math.BigInteger('2');

tmp1 = totient.divide(tmp0);

e = tmp1.nextProbablePrime(); % second part of the public key (n, e)

d = e.modInverse(totient); % the private key

m = java.math.BigInteger('123321243425345'); % message

c = m.modPow(e, n) % encryption

m = c.modPow(d, n) % decryption

http://en.wikipedia.org/wiki/RSA_(algorithm)

--------------------

% Fermat primality test

one = java.math.BigInteger.ONE;

p = java.math.BigInteger('221');

n = p.subtract(java.math.BigInteger.ONE);

a = one.nextProbablePrime();

a.modPow(n, p) % a^n mod p

a = a.nextProbablePrime();

a.modPow(n, p) % this needs to be always one if p is prime

% ...

http://en.wikipedia.org/wiki/Fermat_primality_test

http://en.wikipedia.org/wiki/AKS_primality_test

http://en.memory-alpha.org/wiki/Ode_to_Spot

-----------------

Efficient modpow algorithm for large numbers...

a = java.math.BigInteger('2988348162058574136915891421498819466320163312926952423791023078876139');

b = java.math.BigInteger('2351399303373464486466122544523690094744975233415544072992656881240319');

m = java.math.BigInteger('87814412832655818143772433337418883492663173730772486450699007318453048545183');

ZERO = java.math.BigInteger.ZERO;

ONE = java.math.BigInteger.ONE;

s = ONE;

t = a;

u = b;

while u.equals(ZERO) == 0

if u.and(ONE).equals(ONE) == 1

s = s.multiply(t).mod(m);

end

u = u.shiftRight(1)

t = t.multiply(t).mod(m);

end

a.modPow(b, m)

s

## No comments:

## Post a Comment