Initialize Matrix Elements
Create both serial and CUDA parallel programs based on the following code segment, which multiplies the transpose of a matrix with itself:

    double A[4096][4096], C[4096][4096];
    // insert code to initialize matrix elements to random values between 1.0 and 2.0
    for (i = 0; i < 4096; i++)
      for (j = 0; j < 4096; j++)
        for (k = 0; k < 4096; k++)
          C[ i ][ j ] += A[ k ][ i ] * A[ k ][ j ];

Use the code above as your Serial version, run on OSC using 1 node with 12 processors (full node). Use whatever techniques you feel are appropriate to design a Parallel version.

a) Report your results in estimated GFlops.
b) Measure both serial and parallel performance.
c) Report the CUDA compute structure (Grid, Block and Thread) you used and explain your results.

Part 2

Implement both a serial and a CUDA program that apply the Sobel operator for edge detection to a given set of images.

Background

The Sobel operator performs a 2-D spatial gradient measurement on images. The Sobel edge detector uses a pair of 3 x 3 stencils, or “convolution masks,” one estimating the gradient in the x-direction and the other estimating the gradient in the y-direction. The Sobel detector is very sensitive to noise in images and tends to highlight noise as edges.
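As an illustration of the two-stencil idea described in the Background, here is a minimal serial sketch in C. It is not a reference solution. Several things in it are assumptions the assignment does not specify: the function name sobel_serial, the 8-bit grayscale row-major image layout, setting border pixels to 0, and using |gx| + |gy| as a common approximation of the gradient magnitude sqrt(gx^2 + gy^2). The mask values are the commonly used Sobel coefficients; sign conventions differ between references.

```c
#include <stdlib.h>

/* Commonly used 3x3 Sobel masks (sign conventions vary between references). */
static const int SOBEL_X[3][3] = { {-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1} };
static const int SOBEL_Y[3][3] = { {-1, -2, -1}, {0, 0, 0}, {1, 2, 1} };

/*
 * Hypothetical helper (name and signature are assumptions).
 * in  : 8-bit grayscale image, row-major, width * height bytes
 * out : gradient magnitude, clamped to 0..255, same layout
 * Border pixels are set to 0 because the 3x3 stencil does not fit there.
 */
void sobel_serial(const unsigned char *in, unsigned char *out,
                  int width, int height)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            if (x == 0 || y == 0 || x == width - 1 || y == height - 1) {
                out[y * width + x] = 0;
                continue;
            }
            int gx = 0, gy = 0;
            /* Accumulate both stencils over the 3x3 neighbourhood. */
            for (int dy = -1; dy <= 1; dy++) {
                for (int dx = -1; dx <= 1; dx++) {
                    int p = in[(y + dy) * width + (x + dx)];
                    gx += SOBEL_X[dy + 1][dx + 1] * p;
                    gy += SOBEL_Y[dy + 1][dx + 1] * p;
                }
            }
            /* Assumption: |gx| + |gy| approximates sqrt(gx^2 + gy^2). */
            int mag = abs(gx) + abs(gy);
            out[y * width + x] = (unsigned char)(mag > 255 ? 255 : mag);
        }
    }
}
```

Each output pixel depends only on the input image, so in a CUDA version one natural mapping is one thread per output pixel, with the two outer loops replaced by the thread's (x, y) index. That mapping is a design suggestion, not something the assignment prescribes.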