JavaScript in V8

the pic can not show up but if you really wanna see you can use the pic link, that should be fine.I am a fresh guy in this area and this is my first blog so the logic may be kind of confusing,sorry about that.

JavaScript Memory Layout

JavaScriptis a dynamic language,which means its property can be added or deleted dynamically,most of the JavaScript interpreter use a hash func to calculate the memory address to store the object struct.Obviously it can waste a lot of computional resources thus that’s why js is slow.

hidden class

In order to optimize,V8 add the hidden class which is quite simple to understand.When a class is defined ,we store a pointer point to the hidden class(the pointer is the Map),everytime we add the new property,it would create a new hidden class,this time we use the hidden class and the offset to get the address, that’s quite similar to the Java runtime and cpp vtable.

This would lead to another question——> How about two instances with the same properties but add in different order

1
2
3
4
5
6
7
8
9
class Test{}
var obj1 = new Test();
var obj2 = new Test();

obj1.x=1;
obj1.y=2;

obj2.y=2;
obj2.x=1;

Why would the two objects have different hidden class?

inline cache

The reason here is because of the cache.When we access the same func or property a lot of times,it’s really a waste of time to calculate the address every time,we will just use the address stored in the pointer with the offset instead.So if obj1 and obj2 has the same struct,we can’t get the right information we want.It’s similar to the cached_bucket in OC if you have once wrote the ROP by Objective-C.

Actually,those reasons could be quite simple if we are good at drawing inferences about other cases from one instance.

Procedure in V8

Above is the outline of how V8 engine is working,from the source code to AST,and then TurboFan produce the high speed machine code,we can see that through the runtime function.

1
2
3
4
5
6
7
8
9
10
function f(o){
truevar obj = [1,2,3];
truevar x = Math.ceil(Math.random());
truereturn obj[x+o];
}
%DisassembleFunction(f);
for (let i = 0 ; i < 0x10000 ; i++){
truef(i);
}
%DisassembleFunction(f);

use v8 --allow-natives-syntax xx.js to run the code above,you can find the changes clearly:

before

after

Besides,v8 provides a large quantity of runtime functions, the directory is at src/runtime,or you can just grep.

Stable mode and Dictionary mode

we can find something weird through a benchmark:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
function createObjects() {
return [
{x: 1, y: 2, z: 3},
{a: 1, b: 2, c: 3}
];
}
function test(obj) {
var sum = 0;
for (var i = 0; i < 100; i++) {
sum += obj.a + obj.c;
}
return sum;
}

//case 1
var pair = createObjects();
delete pair[0].y;
test(pair[1]);

//case2
var pair = createObjects();
delete pair[1].b;
test(pair[1]);

Generally,the latter is almost three or four times faster than the former.To explain that, we should know about the two modes to access the JavaScript object in V8:

  1. Dictionary Mode

the slow one use the hash table to store the properties of the object so that we can also call it the hash mode

  1. Stable Mode

the fast one use C struct and the offset just like the vtable in cpp.

When we first create an object,it’s the fast mode.Only if we delete the property which is not added the last or add too much element dynamically,it will degenerate to the slow mode.

I test that by lldb,using v8-debug-7.1.xx:

it’s quite clear that when we delete the origin property “x”,its mode degenerate to the dictionary mode.

Speculation Guards

At first,we will do the speculative compilation,that’s quite simple so that’s focus on the later stage—>the guards

The compiler has some ways to verify its speculations,that’s just with a short piece of machine code:

1
2
3
4
5
6
7
; Ensure is Smi
test rdi, 0x1
jnz bailout

; Ensure has expected Map
cmp QWORD PTR [rdi-0x1], 0x12345601
jne bailout

i have’t talked about how v8 verifies the type, you can search about the tagged pointer, maybe i will add that part later in my blog.OK,that’s move on, if the check fails ,it will perform a bailout to interpreter and likely to
discard of the compiled code.In that case, the function would be re-compiled to perform a polymorphic property load

bypass constantfold in exp

the thought to bypass constant fold is quite interesting, i think that means hide your information.

Firstly, constant fold is normal in the optimizations of compiler,for instance:

1
2
3
let arr = [1,2,3,4]
let idx = 4
return arr[idx];

If you know a little about assembly,you will know that the code above is likely to use some registers,add operator,etc every time we enter the block.In order to be faster,the compiler will translate the code to(But actually if our index is valid , the fold won’t happen,and i don’t know why for now):

1
2
//and after LoadElimination the LoadElement node will disappear because of the fold
return arr[4];

It’s quite clear now so that’s get into the bypass way,maybe someone is confused about what does hide your information mean,if we make some operations like below:

1
2
3
4
5
6
let arr = [1,2,3,4]
let idx = 4
//the range of & is (0,min(a,b))
idx &= 0xfff;
return arr[idx];
//Load Element node still exist means no fold happens

At this time the analyzer can not determine the range of checkBound(from (4,4)to (0,4)),and we also have the better choice—>escape analysis:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
//case 1 Off-by-One 强网杯
arr = [1]
function foo() {
var o = {x: 1, y:2};
return arr[o.x]
}

//case 2 oob 35C3 CTF
function foo(x) {
let a = [0.1, 0.2, 0.3, 0.4];
let o = {mz: -0};
let b = Object.is(Math.expm1(x), o.mz);
return a[b * 1337];
}

If we just return the a[index] straight forward, we would just get the value undefined because of constant fold.Using an object’s property to get the index value, the analyzer sees o.mz as an access to a field through an object reference: it doesn’t know anything about its type because it can’t make assumptions on o.

Links

zhihu

jit exploition