Can someone explain what the ternary operator is doing in this code?

I found this example C code from "The Audio Programming Book".
I understand basically what the code is doing: it takes an array of values that represent the amplitudes of a series of sine waves and adds the waves together to create a complex wave.
I am OK with everything except the line which reads:
a = amps ? amps[i] : 1.f;
I know ternary operators are basically if/else statements, but I cannot figure out exactly what this one is doing, because 'amps' is not defined earlier in the code. It doesn't make sense to me that amps is reusing amps[]; it seems that would be a no-no. I also haven't been able to find a matching example anywhere else.
But the code compiles, so I am completely baffled as to why it is NOT wrong, and what exactly it is doing.
If someone can explain what this is doing [in a traditional if/else form] I would greatly appreciate it.
float* TableGEN::fourier_table(int harms, float *amps, int length, float phase)
{
float a;
float *table = new float[length+2];
double w;
phase *= (float)pi*2;
memset(table, 0, (length+2)*sizeof(float) );
for(int i=0; i < harms; i++)
for(int n=0; n < length+2; n++)
{
a = amps ? amps[i] : 1.f;
w = (i+1)*(n*2*pi/length);
table[n] += (float) (a*cos(w+phase));
}
normalise_table(table, length , 1.0f );
return table;
}
Thanks
Stan

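amps is defined: it is the float* parameter in the signature of fourier_table. The ternary checks whether that pointer is non-null, i.e. whether the caller passed an array at all. If it is, it grabs the element at the given index; otherwise it falls back to a float of 1.
So, in if/else form: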
if (amps)
{
a = amps[i];
}
else
{
a = 1.f;
}
Which is wonky/odd to be honest. It should really be checking if amps[i] is set, and then grab it. If not, then default to 1.f
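As a small illustration of the pointer check described above, here is a minimal sketch in standalone C++. The helper names harmonic_amp and harmonic_amp_if_else are hypothetical and do not appear in the book's code; they only isolate the one line in question.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not in the original code): returns the amplitude for
// harmonic i, or 1.0f when the caller passed a null pointer for amps.
float harmonic_amp(const float* amps, std::size_t i) {
    // A pointer used as a condition is true when it is non-null.
    return amps ? amps[i] : 1.0f;
}

// The same logic written as a plain if/else.
float harmonic_amp_if_else(const float* amps, std::size_t i) {
    if (amps != nullptr) {
        return amps[i];
    }
    return 1.0f;
}
```

Passing a null pointer therefore behaves like "use an amplitude of 1 for every harmonic", which is presumably why the function accepts it.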

Related

Atomic Saxpy in CUDA

I have the following problem in CUDA.
Let's assume we have a list of indices where some, or all, indices can be present more than one time:
inds = [1, 1, 1, 2, 2, 3, 4]
With these indices I would like to perform an atomic saxpy operation (in parallel) on a float array, x. I'm not worried about the order in which the operations are applied. That is, I want to do this, for floats a and k:
x[i] = x[i]*a + k;
This would be trivial if there were no duplicate indices in inds.
My current solution (that does not work) is this:
// assume all values in adr are greater than or equal to 0.
// also assume a and k are strictly positive.
__device__ inline void atomicSaxpy(float *adr,
const float a,
const float k){
float old = atomicExch(adr, -1.0f); // first exchange
float new_;
if (old <= -1.0f){
new_ = -1.0f;
} else {
new_ = old*a + k;
}
while (true) {
old = atomicExch(adr, new_); // second exchange
if (old <= -1.0f){
break;
}
new_ = old*a + k;
}
}
This seems to return the correct answer in many cases.
Here is what I think happens when you do not get the right answer:
old gets a value of -1.0f in the first exchange. => new_ = -1.0f
old gets a value of -1.0f in the second exchange as well.
The function exits without having had any external effect at all.
A somewhat different approach is this:
__device__ inline void atomicSaxpy(float *adr,
const float ia,
const float k){
float val;
while (true) {
val = atomicExch(adr, -1.0f);
if (val > 1.0f){
break;
}
atomicExch(adr, val*ia + k);
}
}
Which consistently deadlocks on my machine. Even for very simple inputs such as the example data above.
Is it possible to rewrite this function to behave properly?
Example Answer
Assuming k=0.1 and a=0.95, and with the initial value of x as 0.5 for all indices, the result should be:
[0.5, 0.7139374999999998,
0.6462499999999999, 0.575, 0.575, ...]
I calculated these values using Python; they will probably look slightly different in CUDA. This is an example of how the algorithm should behave, not a good sample set for triggering the race condition.
Reference
Here is a thread where they implement atomicAdd (which already exists for floats at this point) using atomicExch:
https://devtalk.nvidia.com/default/topic/458062/atomicadd-float-float-atomicmul-float-float-/
An example looks like this:
__device__ inline void atomicAdd(float* address, float value) {
float old = value;
float new_old;
do {
new_old = atomicExch(address, 0.0f);
new_old += old;
}
while ((old = atomicExch(address, new_old)) != 0.0f);
};
This seems a little easier, and I can't quite see how to adapt it.
Other Solutions
Being able to solve this problem in this way has several advantages for my problem related to memory IO down the road. For this reason I would like to know if this is at all possible.
A possible different approach is to count the number of times each index occurs on the CPU, then perform a "regular" saxpy on the GPU after that. I'm assuming there are other possibilities as well, but I'm still interested in an answer to this particular problem.
If this was a non-parallel problem, you would simply do this:
*adr = *adr * a + k;
Since there are multiple threads operating on adr, we should read and write with atomic operations instead:
float adrValue = atomicExch(adr, -1.0f);
float newValue = adrValue * a + k;
atomicExch(adr, newValue);
However, we must be aware of the possibility that another thread has updated adr between our reading step (line 1) and our writing step (line 3).
So the 3-step operation as written here is not atomic.
To make it atomic, we should use compare-and-swap (atomicCAS) to ensure we only update memory if its value is unchanged since we read it. We can then simply repeat our steps, each iteration using the then-current value in adr as the calculation input, until step 3 returns the expected lock value -1.0f.
float adrValue;
do {
    adrValue = atomicExch(adr, -1.0f);
    float newValue = adrValue * a + k;
    // atomicCAS operates on integer words, so pass the float bit patterns.
    adrValue = __int_as_float(atomicCAS((int*)adr,
                                        __float_as_int(-1.0f),
                                        __float_as_int(newValue)));
} while (adrValue != -1.0f);
PS: treat the above as pseudocode.
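For comparison, the usual lock-free form of a compare-and-swap retry loop can be sketched on the host with std::atomic<float>. This is a C++ analogue under stated assumptions, not device code: the function name atomic_saxpy is reused from the question, and on the GPU the same pattern would presumably go through atomicCAS on the integer bit pattern (via __float_as_int and __int_as_float) rather than compare_exchange_weak.

```cpp
#include <atomic>
#include <cassert>
#include <cmath>

// Atomically replaces x with x * a + k using a compare-and-swap retry loop.
// No sentinel "lock" value is needed: the update is only committed if x still
// holds the value the new result was computed from.
void atomic_saxpy(std::atomic<float>& x, float a, float k) {
    float observed = x.load();
    // On failure, compare_exchange_weak stores the current value of x into
    // `observed`, so every retry recomputes the update from fresh input.
    while (!x.compare_exchange_weak(observed, observed * a + k)) {
        // retry
    }
}
```

Because the loop never writes a placeholder into memory, a thread that loses the race simply retries with the newer value, and no thread can observe a half-finished update.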

How to choose a fixed clipping_gradients value [caffe]

In caffe.proto
// Set clip_gradients to >= 0 to clip parameter gradients to that L2 norm,
// whenever their actual L2 norm is larger.
optional float clip_gradients = 35 [default = -1];
I am having trouble setting clip_gradients. I think it should be dynamic anyway, but if we are to choose a fixed number, how should we choose it? Is caffe setting it to 35? What does that mean? I have experimented with a number of fixed choices but I do not see much of a difference. I understand the exploding gradients / gradient clipping concept in the broad sense; however, I am not sure how I should choose a fixed number in the solver.
You can print out the L2 norm of the gradients (the square root of the sum of squared gradients over all learnable parameters) for some iterations to get an idea of a sensible clip_gradients value. Note that the 35 in caffe.proto is the protobuf field number, not a value; the default is -1, which disables clipping. The norm can be printed this way:
net_->forward();
net_->backward();
const vector<Blob<Dtype>*>& net_params = net_->learnable_params();
float sumsq_diff = 0;
for (int i = 0; i < net_params.size(); ++i) {
sumsq_diff += net_params[i]->sumsq_diff();
}
std::cout<<"sum of gradient: "<<std::sqrt(sumsq_diff)<<"\n";
net_->update();
For details about how clip_gradients is used see solver.cpp.
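To make the role of the threshold concrete, here is a minimal standalone sketch of clipping by global L2 norm, which is the behavior the caffe.proto comment describes. It is not Caffe's code: the function names and the layout (one std::vector<float> per parameter blob) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Global L2 norm over all parameter gradients (hypothetical layout:
// one std::vector<float> per parameter blob).
float global_l2_norm(const std::vector<std::vector<float>>& grads) {
    double sumsq = 0.0;
    for (const auto& g : grads)
        for (float v : g) sumsq += static_cast<double>(v) * v;
    return static_cast<float>(std::sqrt(sumsq));
}

// Rescales every gradient by clip / norm when the norm exceeds clip.
// A negative clip leaves the gradients untouched.
void clip_gradients(std::vector<std::vector<float>>& grads, float clip) {
    if (clip < 0.0f) return;
    const float norm = global_l2_norm(grads);
    if (norm > clip) {
        const float scale = clip / norm;
        for (auto& g : grads)
            for (float& v : g) v *= scale;
    }
}
```

Seen this way, a fixed threshold only has an effect on iterations where the printed norm exceeds it, so a value well above the typical norm changes nothing and a value well below it rescales almost every step.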

Have issue with pretty simple C code

I am developing an app in Xcode and have to write a bit of C for an algorithm. Here is part of the C code:
double dataTag[M][N];
// dataTag initialized to values.....
double w[N]; // This is outside for loop at the top level of the method
for (int i = 0; i < N; i++) {
w[i] = pow(10.0, dataTag[2][i] / 10.0 / b);
}
//This is inside for loop.....
double disErr[N];
// disErr set and values confirmed with printArray...
double transedEstSetDrv[N][M];
// transedEstSetDrv set and values confirmed with printArray...
double stepGrad[M] = {0, 0, 0};
for (int j = 0; j < M; j++) {
double dotProductResult[M];
dotProductOfArrays(w, disErr, dotProductResult, N);
stepGrad[j] = sumOfArrayMultiplication(transedEstSetDrv[j], dotProductResult, M);
}
// Print array to console to confirm values
NSLog(@"%f %f %f", stepGrad[0], stepGrad[1], stepGrad[2]); // <-- if this is present, the algorithm gives different results.
//Continue calculations......
So this is part of an algorithm in C that runs inside a for loop. The weird part is the NSLog that prints the stepGrad array: depending on whether I comment out the call to NSLog or not, the algorithm as a whole gives different results.
It would be great if someone gave some debugging suggestions.
Thanks!
UPDATE 1:
Simplified example which has the same issue and gave more code to support the issue.
UPDATE 2:
Removed the length_of_array function and just replaced it with a known number for simplicity.
So I will answer my own question.
Thanks to the comment from @Klas Lindbäck, I fixed the issue, which was caused by not initializing a fixed-size C array declared inside the for loop. So I went over all arrays before and after the code that had the issue and did a
memset(a_c_array, 0, sizeof(a_c_array));
after the declaration of each array. It is now working fine. Thank you for all your help!
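The fix can be sketched in isolation as follows. This is a minimal illustration in C++, with a hypothetical array length kN and hypothetical function names; it shows the memset approach used above next to empty-brace initialization, which zeroes the array at the point of declaration.

```cpp
#include <cassert>
#include <cstring>

constexpr int kN = 4;  // hypothetical array length

// Zeroes a local array with memset, as in the fix above.
double sum_after_memset() {
    double a[kN];                  // indeterminate contents at this point
    std::memset(a, 0, sizeof(a));  // all bytes zero, i.e. 0.0 for IEEE-754 doubles
    double s = 0.0;
    for (double v : a) s += v;
    return s;
}

// Zeroes a local array at the point of declaration.
double sum_after_value_init() {
    double a[kN] = {};             // every element initialized to 0.0
    double s = 0.0;
    for (double v : a) s += v;
    return s;
}
```

Without either form, the array holds whatever was left on the stack, which would explain why an unrelated call such as NSLog could change the results.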

Secure usage of Cell_handle in a CGAL Delaunay triangulation after point insertion

I'm planning to write an algorithm that will use the CGAL Delaunay triangulation data structure.
Basically I need to insert some points into the triangulation, save references to some cells, and then make some other insertions.
I'm wondering how I can store references to cells that are not invalidated after new points are inserted into the triangulation.
It seems to me that Cell_handle is just a pointer to an internal structure, so it's dangerous to store it because the internal container may be reallocated. On the other hand, I can see no way in the Triangulation_3 interface to obtain an index from a Cell_handle.
typedef CGAL::Exact_predicates_inexact_constructions_kernel K;
typedef CGAL::Triangulation_vertex_base_3<K> Vb;
typedef CGAL::Triangulation_hierarchy_vertex_base_3<Vb> Vbh;
typedef CGAL::Triangulation_data_structure_3<Vbh> Tds;
typedef CGAL::Delaunay_triangulation_3<K,Tds> Dt;
typedef CGAL::Triangulation_hierarchy_3<Dt> Dh;
typedef Dh::Vertex_iterator Vertex_iterator;
typedef Dh::Vertex_handle Vertex_handle;
typedef Dh::Point Point;
int main(){
Dh T;
for(int i = 0; i < 100; ++i)
T.insert(Point(rand()%1000,rand()%1000,rand()%1000));
assert( T.is_valid() );
assert( T.number_of_vertices() == 100 );
assert( T.dimension() == 3 );
typedef Dh::Cell_iterator CellIterator;
std::vector<Dh::Cell_handle> hnd;
CellIterator itEnd = T.finite_cells_end();
for(CellIterator it = T.finite_cells_begin(); it!=itEnd; ++it){
const int dist = std::distance(T.cells_begin(),it);
hnd.push_back(it);
}
const int newP(1000);
for(int i = 0; i < newP; ++i)
T.insert(Point(rand()%1000,rand()%1000,rand()%1000));
int finiteC(0),infiniteC(0);
for(int i = 0; i < hnd.size(); ++i){
const int dist = std::distance(T.cells_begin(),hnd[i]);
if(T.is_infinite(hnd[i]))
++infiniteC;
else
++finiteC;
}
assert( T.is_valid() );
return 0;
}
This code systematically crashes but, and this is really strange to me, if I change newP to 10000, it magically works.
Can someone explain to me how to handle this problem?
Since a cell can disappear during the insertion of a new point, the handles you have saved are not guaranteed to point to what you expect.
You get a crash because you use the triangulation hierarchy, which internally creates and removes cells in the internal container. If you use CGAL::Delaunay_triangulation_3, you will not get the crash.
For your problem, you should store a quadruplet of Vertex_handles and use the is_cell function (see the CGAL Triangulation_3 documentation).
Indeed, cells can disappear on insertion. You can also use the find_conflicts() function to find the cells that are going to be deleted by an insertion, so that you can update whatever you maintain related to them.

Lego NXT-RobotC ultrasonic sensor

I am a newbie in programming, so I need help with my ultrasonic-sensor-driven NXT robot.
The sensor is attached to motor(A), and I'd like it to scan the room from the robot's centerline to 90° left and 90° right in 30° increments (seven measurements total), store the data in an array and, based on the largest distance, point my robot in the direction that measurement was taken to avoid obstacles.
Is this possible at all? Or is there some better solution?
Any advice or suggestion is more than welcome.
This would work reasonably well for avoiding obstacles. As for implementing it, that depends on what you are programming the robot in. I only know leJOS (Java), in which a function for getting the angle to turn to would be something like:
public static int scanArea(RegulatedMotor motorTop, UltrasonicSensor sonar) {
int theAngle = 0;
int largestDist = 0;
int currentDist;
for (int rotateAngle = -90; rotateAngle <= 90; rotateAngle += 30) {
motorTop.rotateTo(rotateAngle);
currentDist = sonar.getDistance();
if (currentDist > largestDist) {
largestDist = currentDist;
theAngle = rotateAngle;
}
Delay.msDelay(25);
}
motorTop.rotateTo(0);
return theAngle;
}
If you're coding the robot in any text-based language, that should be fairly easy to convert (assuming you have functions such as rotateTo; otherwise you would have to use relative movements). Otherwise I don't know how easy this would be to do in the graphical programming language that came with the robot.
I would suggest attaching the ultrasonic sensor to a 180° servo that is vertical. You can take a measurement at a specific angle by assigning the corresponding value to the servo. Using this:
int scanArea() {
    int largestDist = 0;
    int Angle = 0;
    for (int servovalue = 0; servovalue <= 255; servovalue += 30) {
        servo[servo1] = servovalue;  // move the servo to the next position
        wait1Msec(250);              // give the servo time to settle (delay is a guess)
        if (SensorValue[Sonar] > largestDist) {
            largestDist = SensorValue[Sonar];
            Angle = servovalue;      // remember the servo value with the longest reading
        }
    }
    return Angle;
}
It assumes that servo1 is your vertical servo and Sonar is your ultrasonic sensor, but this should work if you're programming in RobotC.
