The sole purpose of this project is to aid me in leaning Google's V8 JavaScript engine.
- Introduction
- Address
- TaggedImpl
- Object
- Handle
- FunctionTemplate
- ObjectTemplate
- Small Integers
- String types
- Roots
- Builtins
- Compiler pipeline
- CodeStubAssembler
- Torque
- WebAssembly
- Promises
- V8 Build artifacts
- V8 Startup walkthrough
- Building V8
- Contributing a change
- Debugging
- Building chromium
- Goma chromium
V8 is bascially consists of the memory management of the heap and the execution stack (very simplified but helps make my point). Things like the callback queue, the event loop and other things like the WebAPIs (DOM, ajax, setTimeout etc) are found inside Chrome or in the case of Node the APIs are Node.js APIs:
+------------------------------------------------------------------------------------------+
| Google Chrome |
| |
| +----------------------------------------+ +------------------------------+ |
| | Google V8 | | WebAPIs | |
| | +-------------+ +---------------+ | | | |
| | | Heap | | Stack | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | +-------------+ +---------------+ | | | |
| | | | | |
| +----------------------------------------+ +------------------------------+ |
| |
| |
| +---------------------+ +---------------------------------------+ |
| | Event loop | | Callback queue | |
| | | | | |
| +---------------------+ +---------------------------------------+ |
| |
| |
+------------------------------------------------------------------------------------------+
The execution stack is a stack of frame pointers. For each function called that function will be pushed onto the stack. When that function returns it will be removed. If that function calls other functions they will be pushed onto the stack. When they have all returned execution can proceed from the returned to point. If one of the functions performs an operation that takes time progress will not be made until it completes as the only way to complete is that the function returns and is popped off the stack. This is what happens when you have a single threaded programming language.
So that describes synchronous functions, what about asynchronous functions?
Lets take for example that you call setTimeout, the setTimeout function will be
pushed onto the call stack and executed. This is where the callback queue comes into play and the event loop. The setTimeout function can add functions to the callback queue. This queue will be processed by the event loop when the call stack is empty.
TODO: Add mirco task queue
An Isolate is an independant copy of the V8 runtime which includes its own heap. Two different Isolates can run in parallel and can be seen as entirely different sandboxed instances of a V8 runtime.
To allow separate JavaScript applications to run in the same isolate a context must be specified for each one. This is to avoid them interfering with each other, for example by changing the builtin objects provided.
These allow you to create JavaScript objects without a dedicated constructor. This would be something like:
const obj = {};
This class is declared in include/v8.h and extends Template:
class V8_EXPORT ObjectTemplate : public Template {
...
}
class V8_EXPORT Template : public Data {
...
}
class V8_EXPORT Data {
private:
Data();
};
Template does not have any members/fields, it only declared functions.
We create an instance of ObjectTemplate and we can add properties to it that all instance created using this ObjectTemplate instance will have. This is done by calling Set which is member of the Template class. You specify a Local for the property. Name is a superclass for Symbols and Strings which can be both be used as names for a property.
The implementation for Set
can be found in src/api/api.cc
:
void Template::Set<v8::Local<Name> name, v8::Local<Data> value, v8::PropertyAttribute attribute) {
...
i::ApiNatives::AddDataProperty(isolate, templ, Utils::OpenHandle(*name),
value_obj,
static_cast<i::PropertyAttributes>(attribute));
}
There is an example in objecttemplate_test.cc
Is a template that is used to create functions.
There is an example in functionttemplate_test.cc
An instance of a function template can be created using:
Local<FunctionTemplate> ft = FunctionTemplate::New(isolate_, function_callback, data);
Local<Function> function = ft->GetFunction(context).ToLocalChecked();
And the function can be called using:
MaybeLocal<Value> ret = function->Call(context, recv, 0, nullptr);
Function::Call can be found in src/api/api.cc
:
bool has_pending_exception = false;
auto self = Utils::OpenHandle(this);
i::Handle<i::Object> recv_obj = Utils::OpenHandle(*recv);
i::Handle<i::Object>* args = reinterpret_cast<i::Handle<i::Object>*>(argv);
Local<Value> result;
has_pending_exception = !ToLocal<Value>(
i::Execution::Call(isolate, self, recv_obj, argc, args), &result);
Notice that the result of Call
which is a MaybeHandle<Object>
will be
passed to ToLocal which is defined in api.h
:
template <class T>
inline bool ToLocal(v8::internal::MaybeHandle<v8::internal::Object> maybe,
Local<T>* local) {
v8::internal::Handle<v8::internal::Object> handle;
if (maybe.ToHandle(&handle)) {
*local = Utils::Convert<v8::internal::Object, T>(handle);
return true;
}
return false;
So lets take a look at Execution::Call
which can be found in execution/execution.cc
and it calls:
return Invoke(isolate, InvokeParams::SetUpForCall(isolate, callable, receiver, argc, argv));
SetUpForCall
will return an InvokeParams
. TODO: Take a closer look at InvokeParams.
V8_WARN_UNUSED_RESULT MaybeHandle<Object> Invoke(Isolate* isolate,
const InvokeParams& params) {
Handle<Object> receiver = params.is_construct
? isolate->factory()->the_hole_value()
: params.receiver;
In our case is_construct
is false as we are not using new
and the receiver,
the this
in the function should be set to the receiver that we passed in. After
that we have Builtins::InvokeApiFunction
auto value = Builtins::InvokeApiFunction(
isolate, params.is_construct, function, receiver, params.argc,
params.argv, Handle<HeapObject>::cast(params.new_target));
result = HandleApiCallHelper<false>(isolate, function, new_target,
fun_data, receiver, arguments);
api-arguments-inl.h has
FunctionCallbackArguments::Call(CallHandlerInfo handler) {
...
ExternalCallbackScope call_scope(isolate, FUNCTION_ADDR(f));
FunctionCallbackInfo<v8::Value> info(values_, argv_, argc_);
f(info);
return GetReturnValue<Object>(isolate);
}
The call to f(info) is what invokes the callback, which is just a normal function call.
Back in HandleApiCallHelper
we have:
Handle<Object> result = custom.Call(call_data);
RETURN_EXCEPTION_IF_SCHEDULED_EXCEPTION(isolate, Object);
RETURN_EXCEPTION_IF_SCHEDULED_EXCEPTION
expands to:
Handle<Object> result = custom.Call(call_data);
do {
Isolate* __isolate__ = (isolate);
((void) 0);
if (__isolate__->has_scheduled_exception()) {
__isolate__->PromoteScheduledException();
return MaybeHandle<Object>();
}
} while (false);
Notice that if there was an exception an empty object is returned.
Later in Invoke
execution.cc
auto value = Builtins::InvokeApiFunction(
isolate, params.is_construct, function, receiver, params.argc,
params.argv, Handle<HeapObject>::cast(params.new_target));
bool has_exception = value.is_null();
if (has_exception) {
if (params.message_handling == Execution::MessageHandling::kReport) {
isolate->ReportPendingMessages();
}
return MaybeHandle<Object>();
} else {
isolate->clear_pending_message();
}
return value;
Looking at this is looks like passing back an empty object will cause an exception to be triggered?
When a function is called how is an exception reported/triggered in the call chain?
Local<FunctionTemplate> ft = FunctionTemplate::New(isolate_, function_callback);
Local<Function> function = ft->GetFunction(context).ToLocalChecked();
Local<Object> recv = Object::New(isolate_);
MaybeLocal<Value> ret = function->Call(context, recv, 0, nullptr);
function->Call
will end up in src/api/api.cc
Function::Call
and will
in turn call v8::internal::Execution::Call
:
has_pending_exception = !ToLocal<Value>(
i::Execution::Call(isolate, self, recv_obj, argc, args), &result);
RETURN_ON_FAILED_EXECUTION(Value);
RETURN_ESCAPED(result);
Notice that the result of Call
which is a MaybeHandle<Object>
will be
passed to ToLocal which is defined in api.h
:
template <class T>
inline bool ToLocal(v8::internal::MaybeHandle<v8::internal::Object> maybe,
Local<T>* local) {
v8::internal::Handle<v8::internal::Object> handle;
if (maybe.ToHandle(&handle)) {
*local = Utils::Convert<v8::internal::Object, T>(handle);
return true;
}
return false;
So lets take a look at Execution::Call
which can be found in execution/execution.cc
and it calls:
return Invoke(isolate, InvokeParams::SetUpForCall(isolate, callable, receiver, argc, argv));
InvokeParams
is a struct in execution.cc which has a few static functions
(one being SetupForCall
) and also the following fields:
Handle<Object> target;
Handle<Object> receiver;
int argc;
Handle<Object>* argv;
Handle<Object> new_target;
MicrotaskQueue* microtask_queue;
Execution::MessageHandling message_handling;
MaybeHandle<Object>* exception_out;
bool is_construct;
Execution::Target execution_target;
bool reschedule_terminate;
SetupUpForNew
will set up defaults in addition to set the values for the
constructor, new_target, argc, and argv.
Related to exception handling is:
params.message_handling = Execution::MessageHandling::kReport;
So with that out of the way lets focos on the Invoke
function in execution.cc
around line 240 at the time of this writing.
Now, if the target which is our case is the Function
is a JSFunction there is
a path which will be true in our case:
Handle<JSFunction> function = Handle<JSFunction>::cast(params.target);
Next SaveAndSwitchContext
is called:
SaveAndSwitchContext save(isolate, function->context());
Which will call isolate->set_context(function_context). Next, we have:
Handle<Object> receiver = params.is_construct
? isolate->factory()->the_hole_value()
: params.receiver;
And in our case params.is_construct
is false:
(gdb) p params.is_construct
$1 = false
So the receiver will just be set to the reciever set on the param which is the
receiver we passed in. Next, we have the call to Builtins::InvokeApiFunction
which can be found in `builtins/builtins-api.cc:
Handle<FunctionTemplateInfo> fun_data = function->IsFunctionTemplateInfo()
? Handle<FunctionTemplateInfo>::cast(function)
: handle(JSFunction::cast(*function).shared().get_api_func_data(),
isolate);
(gdb) p function->IsFunctionTemplateInfo()
$2 = false
So in our case the following will be executed:
: handle(JSFunction::cast(*function).shared().get_api_func_data(),
isolate);
TODO: Look into JSFunction and SharedFunctionInfo.
Address small_argv[kBufferSize];
Address* argv;
const int frame_argc = argc + BuiltinArguments::kNumExtraArgsWithReceiver;
In our case we did not pass any arguments so argc is 0.
(gdb) p kBufferSize
$7 = 32
(gdb) p argc
$8 = 0
(gdb) p BuiltinArguments::kNumExtraArgsWithReceiver
$9 = 5
if (frame_argc <= kBufferSize) {
argv = small_argv;
So in our case we will be using small_argv
which is an Array of Address:es.
Next, this array will be populated:
int cursor = frame_argc - 1;
argv[cursor--] = receiver->ptr();
So we are setting the argv[4] to the receiver. The next argument set is:
argv[BuiltinArguments::kPaddingOffset] = ReadOnlyRoots(isolate).the_hole_value().ptr();
argv[BuiltinArguments::kArgcOffset] = Smi::FromInt(frame_argc).ptr();
argv[BuiltinArguments::kTargetOffset] = function->ptr();
argv[BuiltinArguments::kNewTargetOffset] = new_target->ptr()
RelocatableArguments arguments(isolate, frame_argc, &argv[frame_argc - 1]);
result = HandleApiCallHelper<false>(isolate, function, new_target,
fun_data, receiver, arguments);
So we can see that we are passing the function, new_target, HandleApiCallHelper
Object raw_call_data = fun_data->call_code();
CallHandlerInfo call_data = CallHandlerInfo::cast(raw_call_data);
Object data_obj = call_data.data();
Handle<Object> result = custom.Call(call_data);
Call
will land in src/api/api-arguments-inl.h
:
Handle<Object> FunctionCallbackArguments::Call(CallHandlerInfo handler) {
...
v8::FunctionCallback f = v8::ToCData<v8::FunctionCallback>(handler.callback());
}
This is the function callback in our example which is named function_callback
:
(gdb) p f
$14 = (v8::FunctionCallback) 0x413ec6 <function_callback(v8::FunctionCallbackInfo<v8::Value> const&)>
VMState<EXTERNAL> state(isolate);
ExternalCallbackScope call_scope(isolate, FUNCTION_ADDR(f));
FunctionCallbackInfo<v8::Value> info(values_, argv_, argc_);
f(info);
return GetReturnValue<Object>(isolate);
}
This return will return to:
RETURN_EXCEPTION_IF_SCHEDULED_EXCEPTION(isolate, Object);
Which will be expanded by the preprocessor into:
do {
Isolate* __isolate__ = (isolate);
((void) 0);
if (__isolate__->has_scheduled_exception()) {
__isolate__->PromoteScheduledException();
return MaybeHandle<Object>();
}
} while (false);
In our case there was not exception so the following line will be reached:
if (result.is_null()) {
return isolate->factory()->undefined_value();
}
So, we returned an empty/null value from our function and the leads to the
undefined value to be returned. Just noting this as I'm not sure if it is
important of not but CustomArguments has a destructor that will be called
when the FunctionCallbackArguments
instance goes out of scope:
template <typename T>
CustomArguments<T>::~CustomArguments() {
slot_at(kReturnValueOffset).store(Object(kHandleZapValue));
Back now in Builtins::InvokeApiFunction
:
MaybeHandle<Object> result;
... // HandleApiCallHelper was shown above
if (argv != small_argv) delete[] argv;
return result;
So after this return we will be back in Invoke
in execution.cc:
auto value = Builtins::InvokeApiFunction(
isolate, params.is_construct, function, receiver, params.argc,
params.argv, Handle<HeapObject>::cast(params.new_target));
bool has_exception = value.is_null();
Now, even though are callback returned nothing/null, that was checked for and instead undefined was returned.
After this we will return to Execution::Call
which will just return to
v8::ToLocal which will turn the Handle into a MaybeHandle.
This will then return us to Function::Call:
RETURN_ON_FAILED_EXECUTION(Value);
RETURN_ESCAPED(result);
RETURN_ON_FAILED_EXECUTION
will expand to:
do {
if (has_pending_exception) {
call_depth_scope.Escape();
return MaybeLocal<Value>();
}
} while (false);
And RETURN_ESCAPED:
return handle_scope.Escape(result);;
I wonder why this last line is a macro when it is just one line. Finally we will be back in our test
MaybeLocal<Value> ret = function->Call(context, recv, 0, nullptr);
(gdb) p result.is_null()
$15 = true
When calling a Function one can throw an exception using:
isolate->ThrowException(String::NewFromUtf8(isolate, "some error").ToLocalChecked());
ThrowException
can be found in src/api/api.cc
and what it does is:
ENTER_V8_DO_NOT_USE(isolate);
// If we're passed an empty handle, we throw an undefined exception
// to deal more gracefully with out of memory situations.
if (value.IsEmpty()) {
isolate->ScheduleThrow(i::ReadOnlyRoots(isolate).undefined_value());
} else {
isolate->ScheduleThrow(*Utils::OpenHandle(*value));
}
return v8::Undefined(reinterpret_cast<v8::Isolate*>(isolate));
Lets take a closer look at ScheduleThrow
.
void Isolate::ScheduleThrow(Object exception) {
Throw(exception);
PropagatePendingExceptionToExternalTryCatch();
if (has_pending_exception()) {
thread_local_top()->scheduled_exception_ = pending_exception();
thread_local_top()->external_caught_exception_ = false;
clear_pending_exception();
}
}
Throw
will end by setting the pending_exception to the exception passed in.
Next, PropagatePendingExceptionToExternalTryCatch
will be called. This is
where a TryCatch
handler comes into play. If one had been registered, which as
I'm writing this I did not have one (but will add one to try it out and verify).
The code for this part looks like this:
v8::TryCatch* handler = try_catch_handler();
handler->can_continue_ = true;
handler->has_terminated_ = false;
handler->exception_ = reinterpret_cast<void*>(pending_exception().ptr());
// Propagate to the external try-catch only if we got an actual message.
if (thread_local_top()->pending_message_obj_.IsTheHole(this)) return true;
handler->message_obj_ = reinterpret_cast<void*>(thread_local_top()->pending_message_obj_.ptr());
When a TryCatch is created its constructor will call RegisterTryCatchHandler
which will set the thread_local_top try_catch_handler which is retrieved above.
Prior to this there will be a call to IsJavaScriptHandlerOnTop
:
// For uncatchable exceptions, the JavaScript handler cannot be on top.
if (!is_catchable_by_javascript(exception)) return false;
return exception != ReadOnlyRoots(heap()).termination_exception();
}
I really need to understand this better and the various ways to catch/handle exceptions (from C++ and JavaScript). Next (in PropagatePendingExceptionToExternalTryCatch) we have:
// Get the top-most JS_ENTRY handler, cannot be on top if it doesn't exist.
Address entry_handler = Isolate::handler(thread_local_top());
if (entry_handler == kNullAddress) return false;
Next, he have the following:
if (!IsExternalHandlerOnTop(exception)) {
thread_local_top()->external_caught_exception_ = false;
return true;
}
I'm really confused at the moment with these different handler, we have one for.
Now, for a javascript function that is executed using Run
what would be
used in execution.cc Execution::Call would be:
Handle<Code> code = JSEntry(isolate, params.execution_target, params.is_construct);
Handle<Code> JSEntry(Isolate* isolate, Execution::Target execution_target, bool is_construct) {
if (is_construct) {
return BUILTIN_CODE(isolate, JSConstructEntry);
} else if (execution_target == Execution::Target::kCallable) {
return BUILTIN_CODE(isolate, JSEntry);
isolate->builtins()->builtin_handle(Builtins::kJSEntry)
} else if (execution_target == Execution::Target::kRunMicrotasks) {
return BUILTIN_CODE(isolate, JSRunMicrotasksEntry);
}
UNREACHABLE();
}
if (params.execution_target == Execution::Target::kCallable) {
// clang-format off
// {new_target}, {target}, {receiver}, return value: tagged pointers
// {argv}: pointer to array of tagged pointers
using JSEntryFunction = GeneratedCode<Address(
Address root_register_value, Address new_target, Address target,
Address receiver, intptr_t argc, Address** argv)>;
JSEntryFunction stub_entry =
JSEntryFunction::FromAddress(isolate, code->InstructionStart());
Address orig_func = params.new_target->ptr();
Address func = params.target->ptr();
Address recv = params.receiver->ptr();
Address** argv = reinterpret_cast<Address**>(params.argv);
RuntimeCallTimerScope timer(isolate, RuntimeCallCounterId::kJS_Execution);
value = Object(stub_entry.Call(isolate->isolate_data()->isolate_root(),
orig_func, func, recv, params.argc, argv));
Address
can be found in include/v8-internal.h
:
typedef uintptr_t Address;
uintptr_t
is an optional type specified in cstdint and is capable of storing
a data pointer. It is an unsigned integer type that any valid pointer to void
can be converted to this type (and back).
This class is declared in `src/objects/tagged-impl.h and has a single private member which is declared as:
public
constexpr StorageType ptr() const { return ptr_; }
private:
StorageType ptr_;
An instance can be created using:
i::TaggedImpl<i::HeapObjectReferenceType::STRONG, i::Address> tagged{};
Storage type can also be Tagged_t
which is defined in globals.h:
using Tagged_t = uint32_t;
It looks like it can be a different value when using pointer compression.
See tagged_test.cc for an example.
This class extends TaggedImpl:
class Object : public TaggedImpl<HeapObjectReferenceType::STRONG, Address> {
An Object can be created using the default constructor, or by passing in an
Address which will delegate to TaggedImpl constructors. Object itself does
not have any members (apart from ptr_
which is inherited from TaggedImpl that is).
So if we create an Object on the stack this is like a pointer/reference to
an object:
+------+
|Object|
|------|
|ptr_ |---->
+------+
Now, ptr_
is a StorageType so it could be a Smi in which case it would just
contains the value directly, for example a small integer:
+------+
|Object|
|------|
| 18 |
+------+
See object_test.cc for an example.
i::Object obj{18};
i::FullObjectSlot slot{&obj};
+----------+ +---------+
|ObjectSlot| | Object |
|----------| |---------|
| address | ---> | 18 |
+----------+ +---------+
See objectslot_test.cc for an example.
A Maybe is like an optional which can either hold a value or nothing.
template <class T>
class Maybe {
public:
V8_INLINE bool IsNothing() const { return !has_value_; }
V8_INLINE bool IsJust() const { return has_value_; }
...
private:
bool has_value_;
T value_;
}
I first thought that name Just
was a little confusing but if you read this
like:
bool cond = true;
Maybe<int> maybe = cond ? Just<int>(10) : Nothing<int>();
I think it makes more sense. There are functions that check if the Maybe is
nothing and crash the process if so. You can also check and return the value
by using FromJust
.
The usage of Maybe is where api calls can fail and returning Nothing is a way of signaling this.
See maybe_test.cc for an example.
template <class T>
class MaybeLocal {
public:
V8_INLINE MaybeLocal() : val_(nullptr) {}
V8_INLINE Local<T> ToLocalChecked();
V8_INLINE bool IsEmpty() const { return val_ == nullptr; }
template <class S>
V8_WARN_UNUSED_RESULT V8_INLINE bool ToLocal(Local<S>* out) const {
out->val_ = IsEmpty() ? nullptr : this->val_;
return !IsEmpty();
}
private:
T* val_;
ToLocalChecked
will crash the process if val_
is a nullptr. If you want to
avoid a crash one can use ToLocal
.
See maybelocal_test.cc for an example.
Is the super class of all objects that can exist the V8 heap:
class V8_EXPORT Data {
private:
Data();
};
Value extends Data and adds a number of methods that check if a Value
is of a certain type, like IsUndefined()
, IsNull
, IsNumber
etc.
It also has useful methods to convert to a Local, for example:
V8_WARN_UNUSED_RESULT MaybeLocal<Number> ToNumber(Local<Context> context) const;
V8_WARN_UNUSED_RESULT MaybeLocal<String> ToNumber(Local<String> context) const;
...
A Handle is similar to a Object and ObjectSlot in that it also contains
an Address member (called location_
and declared in HandleBase
), but with the
difference is that Handles acts as a layer of abstraction and can be relocated
by the garbage collector.
Can be found in src/handles/handles.h
.
class HandleBase {
...
protected:
Address* location_;
}
template <typename T>
class Handle final : public HandleBase {
...
}
+----------+ +--------+ +---------+
| Handle | | Object | | int |
|----------| +-----+ |--------| |---------|
|*location_| ---> |&ptr_| --> | ptr_ | -----> | 5 |
+----------+ +-----+ +--------+ +---------+
(gdb) p handle
$8 = {<v8::internal::HandleBase> = {location_ = 0x7ffdf81d60c0}, <No data fields>}
Notice that location_
contains a pointer:
(gdb) p /x *(int*)0x7ffdf81d60c0
$9 = 0xa9d330
And this is the same as the value in obj:
(gdb) p /x obj.ptr_
$14 = 0xa9d330
And we can access the int using any of the pointers:
(gdb) p /x *value
$16 = 0x5
(gdb) p /x *obj.ptr_
$17 = 0x5
(gdb) p /x *(int*)0x7ffdf81d60c0
$18 = 0xa9d330
(gdb) p /x *(*(int*)0x7ffdf81d60c0)
$19 = 0x5
See handle_test.cc for an example.
A HandleScope only has three members:
internal::Isolate* isolate_;
internal::Address* prev_next_;
internal::Address* prev_limit_;
Lets take a closer look at what happens when we construct a HandleScope:
v8::HandleScope handle_scope{isolate_};
The constructor call will end up in src/api/api.cc
and the constructor simply delegates to
Initialize
:
HandleScope::HandleScope(Isolate* isolate) { Initialize(isolate); }
void HandleScope::Initialize(Isolate* isolate) {
i::Isolate* internal_isolate = reinterpret_cast<i::Isolate*>(isolate);
...
i::HandleScopeData* current = internal_isolate->handle_scope_data();
isolate_ = internal_isolate;
prev_next_ = current->next;
prev_limit_ = current->limit;
current->level++;
}
Every v8::internal::Isolate has member of type HandleScopeData:
HandleScopeData handle_scope_data_;
HandleScopeData* handle_scope_data() { return &handle_scope_data_; }
HandleScopeData is a struct defined in src/handles/handles.h
:
struct HandleScopeData final {
Address* next;
Address* limit;
int level;
int sealed_level;
CanonicalHandleScope* canonical_scope;
void Initialize() {
next = limit = nullptr;
sealed_level = level = 0;
canonical_scope = nullptr;
}
};
Notice that there are two pointers (Address*) to next and a limit. When a HandleScope is Initialized the current handle_scope_data will be retrieved from the internal isolate. The HandleScope instance that is getting created stores the next/limit pointers of the current isolate so that they can be restored when this HandleScope is closed (see CloseScope).
So with a HandleScope created, how does a Local interact with this instance?
HandleScope:CreateHandle will get the handle_scope_data from the isolate:
Address* HandleScope::CreateHandle(Isolate* isolate, Address value) {
HandleScopeData* data = isolate->handle_scope_data();
if (result == data->limit) {
result = Extend(isolate);
}
// Update the current next field, set the value in the created handle,
// and return the result.
data->next = reinterpret_cast<Address*>(reinterpret_cast<Address>(result) + sizeof(Address));
*result = value;
return result;
}
When a Local is created this will/might go through FactoryBase::NewStruct which will allocate a new Map and then create a Handle for the InstanceType being created:
Handle<Struct> str = handle(Struct::cast(result), isolate());
This will land in the constructor Handlesrc/handles/handles-inl.h
template <typename T>
Handle<T>::Handle(T object, Isolate* isolate): HandleBase(object.ptr(), isolate) {}
HandleBase::HandleBase(Address object, Isolate* isolate)
: location_(HandleScope::GetHandle(isolate, object)) {}
Notice that object.ptr()
is used to pass the Address to HandleBase.
And also notice that HandleBase sets its location_ to the result of HandleScope::GetHandle.
Address* HandleScope::GetHandle(Isolate* isolate, Address value) {
DCHECK(AllowHandleAllocation::IsAllowed());
HandleScopeData* data = isolate->handle_scope_data();
CanonicalHandleScope* canonical = data->canonical_scope;
return canonical ? canonical->Lookup(value) : CreateHandle(isolate, value);
}
Which will call CreateHandle in this case and this function will retrieve the current isolate's handle_scope_data:
HandleScopeData* data = isolate->handle_scope_data();
Address* result = data->next;
if (result == data->limit) {
result = Extend(isolate);
}
In this case both next and limit will be 0x0 so Extend will be called. Extend will also get the isolates handle_scope_data and check the current level and after that get the isolates HandleScopeImplementer:
HandleScopeImplementer* impl = isolate->handle_scope_implementer();
HandleScopeImplementer
is declared in src/api/api.h
The destructor for HandleScope will call CloseScope. See handlescope_test.cc for an example.
TODO:
Has a single member val_
which is of type pointer to T
:
template <class T> class Local {
...
private:
T* val_
}
Notice that this is a pointer to T. We could create a local using:
v8::Local<v8::Value> empty_value;
So a Local contains a pointer to type T. We can access this pointer using
operator->
and operator*
.
We can cast from a subtype to a supertype using Local::Cast:
v8::Local<v8::Number> nr = v8::Local<v8::Number>(v8::Number::New(isolate_, 12));
v8::Local<v8::Value> val = v8::Local<v8::Value>::Cast(nr);
And there is also the
v8::Local<v8::Value> val2 = nr.As<v8::Value>();
See local_test.cc for an example.
Using _v8_internal_Print_Object from c++:
$ nm -C libv8_monolith.a | grep Print_Object
0000000000000000 T _v8_internal_Print_Object(void*)
Notice that this function does not have a namespace. We can use this as:
extern void _v8_internal_Print_Object(void* object);
_v8_internal_Print_Object(*((v8::internal::Object**)(*global)));
Lets take a closer look at the above:
v8::internal::Object** gl = ((v8::internal::Object**)(*global));
We use the dereference operator to get the value of a Local (*global), which is
just of type T*
, a pointer to the type the Local. We are then casting that to
be of type pointer-to-pointer to Object.
gl Object* Object
+-----+ +------+ +-------+
| |----->| |----->| |
+-----+ +------+ +-------+
An instance of v8::internal::Object only has a single data member which is a
field named ptr_
of type Address
:
src/objects/objects.h
:
class Object : public TaggedImpl<HeapObjectReferenceType::STRONG, Address> {
public:
constexpr Object() : TaggedImpl(kNullAddress) {}
explicit constexpr Object(Address ptr) : TaggedImpl(ptr) {}
#define IS_TYPE_FUNCTION_DECL(Type) \
V8_INLINE bool Is##Type() const; \
V8_INLINE bool Is##Type(const Isolate* isolate) const;
OBJECT_TYPE_LIST(IS_TYPE_FUNCTION_DECL)
HEAP_OBJECT_TYPE_LIST(IS_TYPE_FUNCTION_DECL)
IS_TYPE_FUNCTION_DECL(HashTableBase)
IS_TYPE_FUNCTION_DECL(SmallOrderedHashTable)
#undef IS_TYPE_FUNCTION_DECL
V8_INLINE bool IsNumber(ReadOnlyRoots roots) const;
}
Lets take a look at one of these functions and see how it is implemented. For example in the OBJECT_TYPE_LIST we have:
#define OBJECT_TYPE_LIST(V) \
V(LayoutDescriptor) \
V(Primitive) \
V(Number) \
V(Numeric)
So the object class will have a function that looks like:
inline bool IsNumber() const;
inline bool IsNumber(const Isolate* isolate) const;
And in src/objects/objects-inl.h we will have the implementations:
bool Object::IsNumber() const {
return IsHeapObject() && HeapObject::cast(*this).IsNumber();
}
IsHeapObject
is defined in TaggedImpl:
constexpr inline bool IsHeapObject() const { return IsStrong(); }
constexpr inline bool IsStrong() const {
#if V8_HAS_CXX14_CONSTEXPR
DCHECK_IMPLIES(!kCanBeWeak, !IsSmi() == HAS_STRONG_HEAP_OBJECT_TAG(ptr_));
#endif
return kCanBeWeak ? HAS_STRONG_HEAP_OBJECT_TAG(ptr_) : !IsSmi();
}
The macro can be found in src/common/globals.h:
#define HAS_STRONG_HEAP_OBJECT_TAG(value) \
(((static_cast<i::Tagged_t>(value) & ::i::kHeapObjectTagMask) == \
::i::kHeapObjectTag))
So we are casting ptr_
which is of type Address into type Tagged_t
which
is defined in src/common/global.h and can be different depending on if compressed
pointers are used or not. If they are not supported it is the same as Address:
using Tagged_t = Address;
src/objects/tagged-impl.h
:
template <HeapObjectReferenceType kRefType, typename StorageType>
class TaggedImpl {
StorageType ptr_;
}
The HeapObjectReferenceType can be either WEAK or STRONG. And the storage type
is Address
in this case. So Object itself only has one member that is inherited
from its only super class and this is ptr_
.
So the following is telling the compiler to treat the value of our Local,
*global
, as a pointer (which it already is) to a pointer that points to
a memory location that confirms to the layout of an v8::internal::Object type,
which we know now has a prt_
member. And we want to dereference it and pass
it into the function.
_v8_internal_Print_Object(*((v8::internal::Object**)(*global)));
But I'm still missing the connection between ObjectTemplate and object. When we create it we use:
Local<ObjectTemplate> global = ObjectTemplate::New(isolate);
In src/api/api.cc
we have:
static Local<ObjectTemplate> ObjectTemplateNew(
i::Isolate* isolate, v8::Local<FunctionTemplate> constructor,
bool do_not_cache) {
i::Handle<i::Struct> struct_obj = isolate->factory()->NewStruct(
i::OBJECT_TEMPLATE_INFO_TYPE, i::AllocationType::kOld);
i::Handle<i::ObjectTemplateInfo> obj = i::Handle<i::ObjectTemplateInfo>::cast(struct_obj);
InitializeTemplate(obj, Consts::OBJECT_TEMPLATE);
int next_serial_number = 0;
if (!constructor.IsEmpty())
obj->set_constructor(*Utils::OpenHandle(*constructor));
obj->set_data(i::Smi::zero());
return Utils::ToLocal(obj);
}
What is a Struct
in this context?
src/objects/struct.h
#include "torque-generated/class-definitions-tq.h"
class Struct : public TorqueGeneratedStruct<Struct, HeapObject> {
public:
inline void InitializeBody(int object_size);
void BriefPrintDetails(std::ostream& os);
TQ_OBJECT_CONSTRUCTORS(Struct)
Notice that the include is specifying torque-generated
include which can be
found out/x64.release_gcc/gen/torque-generated/class-definitions-tq
. So, somewhere
there must be an call to the torque
executable which generates the Code Stub
Assembler C++ headers and sources before compiling the main source files. There is
and there is a section about this in Building V8
.
The macro TQ_OBJECT_CONSTRUCTORS
can be found in src/objects/object-macros.h
and expands to:
constexpr Struct() = default;
protected:
template <typename TFieldType, int kFieldOffset>
friend class TaggedField;
inline explicit Struct(Address ptr);
So what does the TorqueGeneratedStruct look like?
template <class D, class P>
class TorqueGeneratedStruct : public P {
public:
Where D is Struct and P is HeapObject in this case. But the above is the declartion of the type but what we have in the .h file is what was generated.
This type is defined in src/objects/struct.tq
:
@abstract
@generatePrint
@generateCppClass
extern class Struct extends HeapObject {
}
NewStruct
can be found in src/heap/factory-base.cc
template <typename Impl>
HandleFor<Impl, Struct> FactoryBase<Impl>::NewStruct(
InstanceType type, AllocationType allocation) {
Map map = Map::GetStructMap(read_only_roots(), type);
int size = map.instance_size();
HeapObject result = AllocateRawWithImmortalMap(size, allocation, map);
HandleFor<Impl, Struct> str = handle(Struct::cast(result), isolate());
str->InitializeBody(size);
return str;
}
Every object that is stored on the v8 heap has a Map (src/objects/map.h
) that
describes the structure of the object being stored.
class Map : public HeapObject {
1725 return Utils::ToLocal(obj);
(gdb) p obj
$6 = {<v8::internal::HandleBase> = {location_ = 0x30b5160}, <No data fields>}
So this is the connection, what we see as a Local is a HandleBase. TODO: dig into this some more when I have time.
(lldb) expr gl
(v8::internal::Object **) $0 = 0x00000000020ee160
(lldb) memory read -f x -s 8 -c 1 gl
0x020ee160: 0x00000aee081c0121
(lldb) memory read -f x -s 8 -c 1 *gl
0xaee081c0121: 0x0200000002080433
You can reload .lldbinit
using the following command:
(lldb) command source ~/.lldbinit
This can be useful when debugging a lldb command. You can set a breakpoint and break at that location and make updates to the command and reload without having to restart lldb.
Currently, the lldb-commands.py that ships with v8 contains an extra operation
of the parameter pased to ptr_arg_cmd
:
def ptr_arg_cmd(debugger, name, param, cmd):
if not param:
print("'{}' requires an argument".format(name))
return
param = '(void*)({})'.format(param)
no_arg_cmd(debugger, cmd.format(param))
Notice that param
is the object that we want to print, for example lets say
it is a local named obj:
param = "(void*)(obj)"
This will then be "passed"/formatted into the command string:
"_v8_internal_Print_Object(*(v8::internal::Object**)(*(void*)(obj))")
V8 is single threaded (the execution of the functions of the stack) but there are supporting threads used for garbage collection, profiling (IC, and perhaps other things) (I think). Lets see what threads there are:
$ LD_LIBRARY_PATH=../v8_src/v8/out/x64.release_gcc/ lldb ./hello-world
(lldb) br s -n main
(lldb) r
(lldb) thread list
thread #1: tid = 0x2efca6, 0x0000000100001e16 hello-world`main(argc=1, argv=0x00007fff5fbfee98) + 38 at hello-world.cc:40, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
So at startup there is only one thread which is what we expected. Lets skip ahead to where we create the platform:
Platform* platform = platform::CreateDefaultPlatform();
...
DefaultPlatform* platform = new DefaultPlatform(idle_task_support, tracing_controller);
platform->SetThreadPoolSize(thread_pool_size);
(lldb) fr v thread_pool_size
(int) thread_pool_size = 0
Next there is a check for 0 and the number of processors -1 is used as the size of the thread pool:
(lldb) fr v thread_pool_size
(int) thread_pool_size = 7
This is all that SetThreadPoolSize
does. After this we have:
platform->EnsureInitialized();
for (int i = 0; i < thread_pool_size_; ++i)
thread_pool_.push_back(new WorkerThread(&queue_));
new WorkerThread
will create a new pthread (on my system which is MacOSX):
result = pthread_create(&data_->thread_, &attr, ThreadEntry, this);
ThreadEntry can be found in src/base/platform/platform-posix.
International Components for Unicode (ICU) deals with internationalization (i18n). ICU provides support locale-sensitve string comparisons, date/time/number/currency formatting etc.
There is an optional API called ECMAScript 402 which V8 suppports and which is enabled by default. i18n-support says that even if your application does not use ICU you still need to call InitializeICU :
V8::InitializeICU();
JavaScript specifies a lot of built-in functionality which every V8 context must provide. For example, you can run Math.PI and that will work in a JavaScript console/repl. The global object and all the built-in functionality must be setup and initialized into the V8 heap. This can be time consuming and affect runtime performance if this has to be done every time.
Now this is where the file snapshot_blob.bin
comes into play.
But what are this bin file?
The blobs above are prepared snapshots that get directly deserialized into the
heap to provide an initilized context.
There is an executable named mksnapshot
which is defined in
src/snapshot/mksnapshot.c
.
When V8 is built with v8_use_external_startup_data
the build process will
create a snapshot_blob.bin file using a template in BUILD.gn named run_mksnapshot
, but if false it will generate a file named snapshot.cc.