Fail to reference String.prototype models in prologue.js #1222

Open
ppflower opened this issue Jan 27, 2023 · 3 comments

ppflower commented Jan 27, 2023

I am trying to run a taint analysis on the following JavaScript code with WALA, but it fails.

// taint.js
function source() {
    return "secret:000111"
}

function sink(text) {
    console.log("Sending:", text)
}

function main() {
    var info = source()
    var res = info.replace("secret:", "")
    sink(res)
}

main()

The IR of the function main is as follows:

BB0
BB1
0   v3 = new <JavaScriptLoader,LArray>@0 taint.js [329->645] (line 19) [3=[arguments]]
1   v5 = global:global $$undefined taint.js [568->587] (line 30) [5=[info, $$destructure$rcvr5]]
3   v7 = global:global $$undefined taint.js [592->629] (line 31) [7=[res]]
5   v11 = lexical:source@Ltaint/nodejsModule/moduleSource taint.js [579->585] (line 30)
6   check v11 taint.js [579->585] (line 30)
BB2
7   v13 = global:global __WALA__int3rnal__global taint.js [579->587] (line 30)
8   v9 = invoke v11@8 v13 exception:v14 taint.js [579->587] (line 30) [9=[info, $$destructure$rcvr5]]
BB3
14   v19 = dispatch v18:#replace@14 v9,v20:#secret:,v21:# exception:v22 taint.js [602->629] (line 31) [19=[res]18=[$$destructure$elt5]9=[info, $$destructure$rcvr5]]
BB4
16   v25 = lexical:sink@Ltaint/nodejsModule/moduleSource taint.js [634->638] (line 32)
17   check v25 taint.js [634->638] (line 32)
BB5
18   v26 = global:global __WALA__int3rnal__global taint.js [634->643] (line 32)
19   v23 = invoke v25@19 v26,v19 exception:v27 taint.js [634->643] (line 32) [19=[res]]
BB6

It seems that there is no def instruction for v18 (used in instruction 14), which should represent the function replace defined on String.prototype in prologue.js. This leads to an incomplete call graph (no edges to the functions replace and sink) and a broken taint chain. At first I thought the cause was that WALA cannot determine the type of the value returned by source. However, after I changed the source to a string literal, as shown below, the reference to replace is still not resolved:

function main() {
    var info = "secret:000111"  // source
    var res = info.replace("secret:", "")
    sink(res)
}

Other functions that, like replace, are modeled not as top-level definitions in prologue.js but on global prototype objects have the same problem. Is this a bug? If not, how can I reference and analyze such functions correctly? Thanks.
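For reference, here is a minimal sketch of the plain JavaScript semantics the analysis is expected to mirror. It is ordinary runtime code, not WALA, and it says nothing about how prologue.js is implemented: it only shows that a call to replace on a primitive string resolves to the function stored on String.prototype.

```javascript
// Plain JavaScript semantics, not WALA: where does `replace` come from
// when the receiver is a primitive string?
function source() {
    return "secret:000111";
}

const info = source();

// `replace` is an own property of String.prototype ...
const ownsReplace = Object.prototype.hasOwnProperty.call(String.prototype, "replace");

// ... and a property read on a primitive string finds that same function.
const sameFunction = info.replace === String.prototype.replace;

// With a string pattern, only the first occurrence is replaced.
const res = info.replace("secret:", "");

console.log(ownsReplace, sameFunction, res);
```

So the dispatch in instruction 14 has exactly one runtime target here, whether info comes from source() or from a literal.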

@ppflower
Author

By the way, the problem occurred when I used the makeCGBuilder method of the class com.ibm.wala.cast.js.nodejs.NodejsCallGraphBuilderUtil. I have since found that a call graph built with the class com.ibm.wala.cast.js.util.FieldBasedCGUtil correctly references the methods modeled on String.prototype. However, I need the functionality provided by NodejsRequireTargetSelector.

@msridhar
Member

@ppflower thanks for the report. I can confirm this is a bug, and I've added a test case here:

msridhar@fdaed92

It's an issue with string constants; WALA successfully finds the CG edge to replace from main2 but not main. @juliandolby any idea what's going on here and how we track string constants during pointer analysis with CAst JavaScript?

@ppflower
Author

@msridhar Thanks for replying. Now I understand what happened: when WALA handles JavaScript function dispatch, it seems to rely on the concrete heap object to determine the target method. I also ran into a related problem with callbacks, which are very common in JavaScript programs. It is a bit more complex than the previous one. Consider the following piece of code:

wx.request({
    url: 'www.api.com',
    data: {},
    success: function (res) {
        console.log(res.data.key.substring(0, 2))
    },
    fail: function (err) {}
})

wx.request is the network request API of a JS framework. The parameter res of the success callback is the result of the request. The only thing we know about res is that it has a data property, which holds the response as a JSON object. I can model the behavior of wx.request in extended-prologue.js:

wx = {
    request(e) {
        var resp = {
            data:{}
        };
        e.success(resp);
        e.fail("")
    }
}

But it is not possible to model every property that the data object might have. So when the success function accesses the key property of data, the lookup yields nothing. WALA therefore does not know that key is a string and fails to find the edge to the String.prototype.substring method. Is there a way to handle this kind of situation?
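To make the gap concrete, here is a small runnable sketch that executes the model above as ordinary JavaScript. It is only an illustration of the model's behavior at runtime, not of WALA's pointer analysis, and the variable observedKey is introduced here purely for the demonstration.

```javascript
// The wx.request model from above, run as plain JavaScript.
const wx = {
    request(e) {
        const resp = {
            data: {}          // the model knows nothing about the fields of data
        };
        e.success(resp);
        e.fail("");
    }
};

// Hypothetical probe variable: records what the callback sees for data.key.
let observedKey = "not called";

wx.request({
    url: 'www.api.com',
    data: {},
    success: function (res) {
        // With the model, data has no key property, so this read yields undefined
        // and a following .substring(0, 2) call would have no receiver.
        observedKey = res.data.key;
    },
    fail: function (err) {}
});

console.log(typeof observedKey);
```

Because the modeled data object is empty, the read of key produces no value at all, which matches the missing receiver described above.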

My idea is that when the receiver variable of an invoke statement points to nothing, WALA might try to predict the type of the target and create an object of that type (in this case String). The prediction strategy could be chosen by the user. I'm not very familiar with the WALA code, so this is just a quick idea :) Thanks a lot for your time.
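A toy sketch of that strategy, written in plain JavaScript, might look like the following. Everything in it is hypothetical: prototypeModels, resolveTargets, and the fallbackTypes parameter are made-up names for illustration and do not correspond to any WALA API or to how WALA resolves dispatch internally.

```javascript
// Toy model only: none of these names exist in WALA.
// Maps a type name to the methods modeled on its prototype.
const prototypeModels = {
    String: { substring: "String.prototype.substring", replace: "String.prototype.replace" },
    Array: { slice: "Array.prototype.slice" }
};

// pointsTo: type names the receiver may point to (empty if nothing is known).
// fallbackTypes: user-chosen types to assume when pointsTo is empty.
function resolveTargets(pointsTo, method, fallbackTypes) {
    const types = pointsTo.length > 0 ? pointsTo : fallbackTypes;
    const targets = [];
    for (const type of types) {
        const model = prototypeModels[type];
        if (model && Object.prototype.hasOwnProperty.call(model, method)) {
            targets.push(model[method]);
        }
    }
    return targets;
}

// Receiver of res.data.key.substring(0, 2): nothing is known about key.
const withoutFallback = resolveTargets([], "substring", []);
const withFallback = resolveTargets([], "substring", ["String", "Array"]);

console.log(withoutFallback.length, withFallback);
```

In this sketch the fallback only applies when the points-to set is empty, so precise results are left untouched; the cost is possible spurious edges whenever the guessed types are wrong.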
