sampsyo/quala

Emitting Type Annotations for global variable

sangeeta0201 opened this issue · 7 comments

Hello Adrian,

I am trying to emit type annotations for a global pointer. This is what my C code looks like:

int CHECK_AV *  shd;

int main() {
  shd = (int*)malloc(10 * sizeof(int));
  return 0;
}

This is what the resulting LLVM IR looks like:

define dso_local i32 @main() local_unnamed_addr #4 {
entry:
  %call = tail call noalias i8* @malloc(i64 40) #3
  store i8* %call, i8** bitcast (i32** @shd to i8**), align 8, !tbaa !2 ; I want to add the type annotation here
  ret i32 0
}

The tbaa annotation comes from the call CGM.DecorateInstructionWithTBAA(Store, TBAAInfo); in void CodeGenFunction::EmitStoreOfScalar. That's why I also added CGM.TADecorate(Store, Ty); at the same place.

I am printing a debug message in CGTypeAnnotation.cpp:

  std::cout<<"success1\n";
  Inst->setMetadata("tyann", node);

The success message is printed during compilation, but tyann is not attached to the IR. I am new to Clang, so I have not been able to figure out how to debug this.
Surprisingly, if I change my code to

int main() {
  shd = (int*)malloc(10 * sizeof(int));
  shd[1]++;
  return 0;
}

I do get my annotations from the call in EmitStoreOfScalar:

; Function Attrs: norecurse nounwind uwtable
define dso_local i32 @main() local_unnamed_addr #4 {
entry:
  %call = tail call noalias i8* @malloc(i64 40) #3
  store i8* %call, i8** bitcast (i32** @shd to i8**), align 8, !tbaa !2
  %arrayidx = getelementptr inbounds i8, i8* %call, i64 4
  %0 = bitcast i8* %arrayidx to i32*
  %1 = load i32, i32* %0, align 4, !tyann !6
  %inc = add nsw i32 %1, 1
  store i32 %inc, i32* %0, align 4, !tbaa !7, !tyann !6
  ret i32 0
}

If you can help me, it will mean a lot to me.

Interesting! So it sounds like the type qualifier isn’t attached to the type of the Store value at that point in codegen.

My first guess is that this is because the store you’re looking at writes the pointer to a type-qualified value, not the value itself. The pointer type is not qualified, only the pointed-to type. This is usually a nice distinction to have: for example, in an information flow system, you might have a private pointer to public data.

But if it’s getting in your way, you might consider adding a simple check to detect when the result of a malloc call is later dereferenced to a qualified value by a load.

Thank you so much for your response. I am only concerned with the type-annotated value. I might pass this pointer to another function that reads and writes the value it points to, but inside that function the value won't be annotated. For this reason, I need the pointer type itself to be qualified. How can I do that? Any suggestions?

Well, before diving into that, it might make sense to verify my hypothesis above. For example, take a look at the test input for the included nullability checker:
https://github.com/sampsyo/quala/blob/master/examples/nullness/test/simple.c

It looks like the way I found to annotate the pointer, rather than the int, in a declaration was int * NULLABLE b;, which is the opposite order from your example. Maybe the solution is as simple as that reordering?
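The effect of qualifier placement can be seen with a standard qualifier. The following is a minimal sketch, not quala code: `volatile` is only a stand-in for a custom qualifier such as CHECK_AV or NULLABLE, and the names `pointee_qualified`, `pointer_qualified`, and `init_both` are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* `volatile` is only a stand-in for a custom qualifier such as CHECK_AV. */
volatile int *pointee_qualified;  /* plain pointer to a qualified int */
int *volatile pointer_qualified;  /* qualified pointer to a plain int */

/* Allocates both buffers; returns 0 on success, -1 on allocation failure. */
int init_both(size_t n) {
    assert(n >= 2);
    /* The assigned lvalue has type `volatile int *`: no top-level qualifier. */
    pointee_qualified = malloc(n * sizeof(int));
    /* The assigned lvalue has type `int *volatile`: the pointer is qualified. */
    pointer_qualified = malloc(n * sizeof(int));
    if (pointee_qualified == NULL || pointer_qualified == NULL)
        return -1;
    pointee_qualified[1] = 7; /* qualified access to the int */
    pointer_qualified[1] = 7; /* plain access to the int, via a qualified pointer */
    return 0;
}
```

Compiling this with something like clang -S -emit-llvm should show the qualifier on the store of the malloc result only for the second assignment, which mirrors the hypothesis above: in int CHECK_AV * shd the store to shd itself carries no qualified type.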

Program:

int * CHECK_AV shd;

int main() {
  shd = (int*)malloc(10 * sizeof(int));
  return 0;
}

When I compile with -O0, I do get the annotation:

define dso_local i32 @main() #5 {
entry:
  %retval = alloca i32, align 4
  store i32 0, i32* %retval, align 4
  %call = call noalias i8* @malloc(i64 40) #3
  %0 = bitcast i8* %call to i32*
  store i32* %0, i32** @shd, align 8, !tyann !2
  ret i32 0
}

But when I compile with -O1, I don't:

define dso_local i32 @main() local_unnamed_addr #4 {
entry:
  %call = tail call noalias i8* @malloc(i64 40) #3
  store i8* %call, i8** bitcast (i32** @shd to i8**), align 8, !tbaa !6
  ret i32 0
}

How can I make sure that I get the same annotations at -O1 as I do at -O0?

Great!

Preserving metadata through optimizations actually seems like a really hard problem. I’d recommend you run your pass that consumes annotations early in the pipeline instead.

It does not look as though no one has done this before: tbaa annotations, for example, are preserved through optimizations. Anyway, I will try to figure it out. Thank you so much for your time. I truly appreciate it.

Great! Yeah, if you’re feeling ambitious, it does seem possible with a good chunk of hacking. Good luck, and I’d be interested to hear how it goes!