https://www.freebuf.com/articles/web/283795.html (main reference)

https://github.com/Firebasky/CodeqlLearn

A rough-and-ready way to generate the database

To generate the database I used this project:

https://github.com/ice-doom/codeql_compile

To build a database from Java source code:

codeql database create "D:\google download\cc_database" --language="java" --source-root="D:\google download\micro_service_seclab-main" --overwrite

The example I use here is a JAR file:

python .\codeql_compile.py -a D:\codeql_compile\ezjava.jar  -d D:\codeql_compile\ezjava\BOOT-INF\lib

codeql database create D:\codeql_compile\demo-database --language="java" --source-root=D:\codeql_compile\ezjava.jar_save_1703684740 --command="run.cmd"

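Those two steps are enough to generate the database.

Next, import the database.

First create a folder to hold the QL files. It needs the following file: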

qlpack.yml

name: example-query
version: 0.0.0
libraryPathDependencies: codeql-java

After that, you can start writing QL files.

Writing QL

Basic queries

First, query all methods in the database.

This is done with the Method class:

import java

from Method method
select method

This returns every method in the database.

Querying a specific method

import java

from Method method
where method.hasName("resolveClass")
select method

To also show which class declares the method, add one more expression to the select clause:

import java

from Method method
where method.hasName("resolveClass")
select method,method.getDeclaringType()

Querying a method in the subclasses of a given parent class

import java

from Method method
where method.hasName("resolveClass") and method.getDeclaringType().getASupertype().hasQualifiedName("java.io", "ObjectInputStream")
select method, method.getDeclaringType()

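This query finds the resolveClass method in subclasses of ObjectInputStream. There is a catch: it only finds direct subclasses. A resolveClass method in a subclass two levels down is not found, because getASupertype() only steps to the direct supertypes of the declaring type.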
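One way to reach subclasses at any depth is the transitive closure operator covered in the recursion section later in these notes. The sketch below is the same query with + appended to getASupertype(). It assumes the standard CodeQL Java library and has not been run against the example database.

```ql
import java

from Method method
where
  method.hasName("resolveClass") and
  // `+` applies getASupertype() one or more times,
  // so indirect subclasses of ObjectInputStream match as well
  method.getDeclaringType().getASupertype+().hasQualifiedName("java.io", "ObjectInputStream")
select method, method.getDeclaringType()
```

Using * instead of + would also match a resolveClass method declared on ObjectInputStream itself.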

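Finding the callers of a method in a given class

Call and Callable

Callable represents the set of things that can be called: methods and constructors.

Call represents the act of calling a Callable (a method call, a constructor call, and so on).

Filtering method calls

MethodAccess

The usual pattern is to find the Method first and then compare it with MethodAccess.getMethod().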

import java

from MethodAccess call, Method method
where method.hasName("resolveClass") and method.getDeclaringType().getAnAncestor().hasQualifiedName("java.io", "ObjectInputStream") and call.getMethod() = method
select call

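This finds the places where resolveClass is called. Note that getAnAncestor() follows the supertype relation transitively (and includes the type itself), so unlike the getASupertype() query above it should not be limited to subclasses one level below ObjectInputStream.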
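To see who makes each call, and not only the call expression, the enclosing callable can be selected as well. A minimal sketch that reuses the query above; getEnclosingCallable() comes from the standard Java library, and the extra result columns are my own choice:

```ql
import java

from MethodAccess call, Method method
where
  method.hasName("resolveClass") and
  method.getDeclaringType().getAnAncestor().hasQualifiedName("java.io", "ObjectInputStream") and
  call.getMethod() = method
// the call, the method or constructor that contains it, and that callable's class
select call, call.getEnclosingCallable(), call.getEnclosingCallable().getDeclaringType()
```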

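Predicates (think of them as functions)

As in SQL, a where clause that grows too long becomes hard to read. CodeQL provides a mechanism for wrapping a long condition in something like a function.

That function is called a predicate.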

import java
 
predicate isStudent(Method method) {
exists(|method.hasName("getStudent"))
}
 
from Method method
where isStudent(method)
select method.getName(), method.getDeclaringType()

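Syntax notes

The predicate keyword declares a predicate that has no result value.

exists is a very common construct in CodeQL predicates. It holds when at least one binding of its declared variables satisfies the inner condition, and that true-or-false outcome decides which rows are kept.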
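The exists in the example above declares no variables, so a variant that does declare one may make the construct clearer. The sketch below is an assumption-laden illustration: isCalled is a name made up here, and MethodAccess is used to match the rest of these notes (newer library versions call this class MethodCall).

```ql
import java

/** Holds if `method` is the target of at least one call in the database. */
predicate isCalled(Method method) {
  // `call` only exists inside the parentheses; the predicate holds
  // as soon as one matching call is found
  exists(MethodAccess call | call.getMethod() = method)
}

from Method method
where method.hasName("getStudent") and isCalled(method)
select method, method.getDeclaringType()
```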

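Setting up Source and Sink

What are source and sink

In the theory of automated code security auditing, the central concept is the triple (source, sink, sanitizer).

A source is the input point of a taint chain. The code that reads HTTP request parameters, for example, is an obvious source.

A sink is the point where the taint chain is executed. For SQL injection, the function that finally executes the SQL statement is the sink (it might be called query, exeSql, or something else).

A sanitizer, also called a cleansing function, is a method that blocks propagation somewhere along the chain.

A vulnerability exists only when a source and a sink are both present and the path from source to sink is unobstructed.

Setting the source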

override predicate isSource(DataFlow::Node src) {}

We are using the Spring Boot framework, so the source is the parameter through which HTTP input enters the code. In the code below, the source is username:

@RequestMapping(value = "/one")
public List<Student> one(@RequestParam(value = "username") String username) {
    return indexLogic.getStudent(username);
}

In the code below, the source is Student user (the fact that user is of type Student makes no difference).

@PostMapping(value = "/object")
public List<Student> objectParam(@RequestBody Student user) {
    return indexLogic.getStudent(user.getUsername());
}

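The code that sets the source is:

override predicate isSource(DataFlow::Node src) { src instanceof RemoteFlowSource }

This rule ships with the SDK and covers most of the commonly used source entry points. Spring Boot, which we are using, is among them, so we can use it directly.

instanceof is part of CodeQL's own syntax.

Of course, the statement above is not a complete, runnable query. It is a fragment of the full query shown later, pulled out here for explanation.

Setting the sink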

override predicate isSink(DataFlow::Node sink) {

  }

In this example, the sink should be a call (MethodAccess) to the query method (Method), so we define the sink as:

override predicate isSink(DataFlow::Node sink) {
exists(Method method, MethodAccess call |
  method.hasName("query")
  and
  call.getMethod() = method and
  sink.asExpr() = call.getArgument(0)
)
}

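In this predicate, call is the MethodAccess discussed earlier, which is what lets us find where a method is called.

Note: the code above uses the exists sub-query syntax, whose form is exists(Obj obj | something). The query means: find a call site of a query() method and treat its first argument as the sink (this describes the last condition in the predicate above).

In the target application (micro-service-seclab), the sink is: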

jdbcTemplate.query(sql, ROW_MAPPER);

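A note up front:

Because we are testing for an injection vulnerability, the injection only happens when a source value flows into this method.

Again, the statement above is not a complete, runnable query. It is a fragment of the full query shown later, pulled out here for explanation.

Data flow

Setting the source and sink takes care of the two ends, but whether a vulnerability exists depends on whether the two ends actually connect.

If a tainted variable can flow unimpeded into a dangerous function, a vulnerability exists.

The CodeQL engine does this connecting work itself. We call config.hasFlowPath(source, sink) to check whether the two ends are connected.

For example: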

from VulConfig config, DataFlow::PathNode source, DataFlow::PathNode sink
where config.hasFlowPath(source, sink)
select source.getNode(), source, sink, "source"

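We pass the source and sink we defined to config.hasFlowPath(source, sink), and the engine works out for us whether the vulnerability exists.

Combining source and sink in a query

In CodeQL, we define the source and sink with the official TaintTracking::Configuration class. Whether the path between them is connected is then handled by config.hasFlowPath(source, sink).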

class VulConfig extends TaintTracking::Configuration {
  VulConfig() { this = "SqlInjectionConfig" }

  override predicate isSource(DataFlow::Node src) { src instanceof RemoteFlowSource }

  override predicate isSink(DataFlow::Node sink) {
    exists(Method method, MethodAccess call |
      method.hasName("query")
      and
      call.getMethod() = method and
      sink.asExpr() = call.getArgument(0)
    )
  }
}

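CodeQL's syntax resembles Java's: extends means the class inherits from the parent class TaintTracking::Configuration.

This is the general-purpose class that the official library provides for data flow analysis. It offers many predicates related to data flow analysis, such as isSource (defines the source) and isSink (defines the sink).

src instanceof RemoteFlowSource means that src must be of type RemoteFlowSource. RemoteFlowSource comes with a very complete set of source definitions, which already covers the Spring Boot sources used here.

The final demo.ql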

/**
 * @id java/examples/vuldemo
 * @name Sql-Injection
 * @description Sql-Injection
 * @kind path-problem
 * @problem.severity warning
 */

import java
import semmle.code.java.dataflow.FlowSources
import semmle.code.java.security.QueryInjection
import DataFlow::PathGraph


class VulConfig extends TaintTracking::Configuration {
  VulConfig() { this = "SqlInjectionConfig" }

  override predicate isSource(DataFlow::Node src) { src instanceof RemoteFlowSource }

  override predicate isSink(DataFlow::Node sink) {
    exists(Method method, MethodAccess call |
      method.hasName("query")
      and
      call.getMethod() = method and
      sink.asExpr() = call.getArgument(0)
    )
  }
}


from VulConfig config, DataFlow::PathNode source, DataFlow::PathNode sink
where config.hasFlowPath(source, sink)
select source.getNode(), source, sink, "source"

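Note: unlike comments in other languages, the comment block at the top must not be deleted. It is part of the program: when a report is generated, the name, description, and other fields in it are written into the audit report.

(If you delete it and run the query again, the alerts view is no longer produced, and you can no longer clearly see the nodes between source and sink.)

Fixing false positives in the query above

Among the chains reported above, one reaches the sink through a parameter of type long, which cannot carry a SQL injection.

This shows that our rule produces false positives for parameters that are lists of numbers (such as List<Long>): the source wrongly covers parameters of this kind.

We need a way to eliminate this false positive.

That way is isSanitizer.

isSanitizer is the sanitizing predicate provided by CodeQL's TaintTracking::Configuration class. Its prototype is: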

override predicate isSanitizer(DataFlow::Node node) {}

CodeQL's default rules check whether the current node has a basic type:

override predicate isSanitizer(DataFlow::Node node) {
  node.getType() instanceof PrimitiveType or
  node.getType() instanceof BoxedType or
  node.getType() instanceof NumberType
}

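Because the isSanitizer predicate in CodeQL's SQL injection check only tests for basic types and not for this kind of compound type, we got this false positive.

We only need to add this compound type to isSanitizer to eliminate it.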

override predicate isSanitizer(DataFlow::Node node) {
  node.getType() instanceof PrimitiveType or
  node.getType() instanceof BoxedType or
  node.getType() instanceof NumberType or
  // generic types whose first type argument is a numeric type
  exists(ParameterizedType pt | node.getType() = pt and pt.getTypeArgument(0) instanceof NumberType)
}

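The code above means: if the type of the current node is a basic type, a numeric type, or a generic type with a numeric type argument (for example List<Long>), the data flow is cut off there and the analysis does not continue past that node.
After re-running the query, the false positive from before is gone.

Here a generic type means something like List<Long>. ParameterizedType is the class for generic types, and pt.getTypeArgument(0) instanceof NumberType checks whether the first type argument is a numeric type.

For such a node the predicate holds, because the node matches this condition.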
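To check which parameters this extra condition would actually cut off, the same type test can be run as a standalone query. A small sketch, assuming ParameterizedType and NumberType are reachable through import java as they are in the queries in these notes; it has not been run against the example database:

```ql
import java

from Parameter p, ParameterizedType pt
where
  p.getType() = pt and
  // same test as in isSanitizer: the first type argument is a numeric type
  pt.getTypeArgument(0) instanceof NumberType
select p, pt
```

If a parameter that should stay tainted shows up in the results, the sanitizer condition is too broad for that code base.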

Fixing false negatives

The results are incomplete: some chains are not reported, for example the following code:

public List<Student> getStudentWithOptional(Optional<String> username) {
        String sqlWithOptional = "select * from students where username like '%" + username.get() + "%'";
        //String sql = "select * from students where username like ?";
        return jdbcTemplate.query(sqlWithOptional, ROW_MAPPER);
    }

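The fix is to join the broken link by force.

isAdditionalTaintStep is a predicate provided by CodeQL's TaintTracking::Configuration class. Its prototype is:

override predicate isAdditionalTaintStep(DataFlow::Node node1, DataFlow::Node node2) {}

Its effect is to force taint from a controllable node A to another node B, so that B becomes controllable as well.

After many tests, I concluded that the flow breaks at username.get(), probably because this use of the Optional type is not modeled in CodeQL's libraries.

So here we force username to flow into username.get(), which makes username.get() controllable. That should be enough for this injection to be detected.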

/**
 * @id java/examples/vuldemo
 * @name Sql-Injection
 * @description Sql-Injection
 * @kind path-problem
 * @problem.severity warning
 */

import java
import semmle.code.java.dataflow.FlowSources
import semmle.code.java.security.QueryInjection
import DataFlow::PathGraph

predicate isTaintedString(Expr expSrc, Expr expDest) {
  exists(Method method, MethodAccess call, MethodAccess call1 |
    // expSrc: an Optional<String> passed as the first argument of some call
    expSrc = call1.getArgument(0) and
    call1.getArgument(0).getType().toString() = "Optional<String>" and
    // expDest: a call to Optional<String>.get()
    expDest = call and
    call.getMethod() = method and
    method.hasName("get") and
    method.getDeclaringType().toString() = "Optional<String>"
  )
}

class VulConfig extends TaintTracking::Configuration {
  VulConfig() { this = "SqlInjectionConfig" }

  override predicate isSource(DataFlow::Node src) { src instanceof RemoteFlowSource }

  override predicate isSanitizer(DataFlow::Node node) {
    node.getType() instanceof PrimitiveType or
    node.getType() instanceof BoxedType or
    node.getType() instanceof NumberType or
    exists(ParameterizedType pt| node.getType() = pt and pt.getTypeArgument(0) instanceof NumberType )
  }

  override predicate isSink(DataFlow::Node sink) {
    exists(Method method, MethodAccess call |
      method.hasName("query")
      and
      call.getMethod() = method and
      sink.asExpr() = call.getArgument(0)
    )
  }

  override predicate isAdditionalTaintStep(DataFlow::Node node1, DataFlow::Node node2) {
    isTaintedString(node1.asExpr(), node2.asExpr())
  }
}


from VulConfig config, DataFlow::PathNode source, DataFlow::PathNode sink
where config.hasFlowPath(source, sink)
select source.getNode(), source, sink, "source"

Note: above we implemented an isTaintedString predicate that uses exists to force a link from the Optional<String> username to username.get().
With that, the injection is finally reported.
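The isTaintedString predicate links any Optional<String> call argument to any get() call, whether or not the two involve the same variable. A narrower alternative, sketched here as an assumption and not as the original author's method, ties each get() call to its own qualifier (the username in username.get()). The name isOptionalGetStep is made up for this sketch, and the type check by toString() simply mirrors the one used above.

```ql
import java

/**
 * Holds if `expDest` is a call to `get` on an `Optional<String>`
 * and `expSrc` is the expression the call is made on.
 */
predicate isOptionalGetStep(Expr expSrc, Expr expDest) {
  exists(MethodAccess call |
    call.getMethod().hasName("get") and
    call.getMethod().getDeclaringType().toString() = "Optional<String>" and
    // link the qualifier of this particular call to the call itself
    expSrc = call.getQualifier() and
    expDest = call
  )
}

// list the steps the predicate would add, to review them before use
from Expr src, Expr dest
where isOptionalGetStep(src, dest)
select src, dest
```

To try it in the configuration, isAdditionalTaintStep would call isOptionalGetStep(node1.asExpr(), node2.asExpr()) in place of isTaintedString.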

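Following the nodes feels much like stepping through with a debugger: the path tracks username all the way, in great detail.

Advanced CodeQL queries

Recursion

Recursion helps with a class of problems where we do not know how many times a predicate has to be applied to reach the result.

The syntax for recursive calls in CodeQL is to put * or + after the predicate name, meaning zero or more and one or more applications respectively (as in regular expressions). With zero applications the element itself is returned.
An example:

Java lets us nest a class inside a class. With several levels of nesting, how do we find the outermost class?
Take the following code: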

public class StudentService {
 
    class innerOne {
        public innerOne(){}
 
        class innerTwo {
            public innerTwo(){}
 
            public String Nihao() {
                return "Nihao";
            }
        }
        public String Hi(){
            return "hello";
        }
    }
 
}

The non-recursive approach:

import java
 
from Class classes
where classes.getName().toString() = "innerTwo"
select classes.getEnclosingType().getEnclosingType()   // getEnclosingType() returns the enclosing type

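The recursive approach:

Appending * (start from the element itself) or + (start from one level up) to the predicate call solves this problem.

(That is, as described at the start of this section, just append the operator to the predicate name.)

import java

from Class classes
where classes.getName().toString() = "innerTwo"
select classes.getEnclosingType+()   // every enclosing type, at any depth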

We can also wrap the step in a predicate of our own and apply that recursively.

import java
 
RefType demo(Class classes) {
    result = classes.getEnclosingType()
}
 
from Class classes
where classes.getName().toString() = "innerTwo"
select demo*(classes)   // the class itself plus every enclosing type
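The recursive queries return every enclosing type, while the question posed at the start of this section asks for the outermost one only. One way to narrow the result, sketched under the assumption that a top-level class has no enclosing type of its own:

```ql
import java

from Class inner, RefType outer
where
  inner.hasName("innerTwo") and
  outer = inner.getEnclosingType+() and
  // keep only the enclosing type that is not nested in anything else
  not exists(outer.getEnclosingType())
select inner, outer
```

For the StudentService example this is intended to leave StudentService as the only value of outer.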

Type casts

import java
 
from Parameter param
select param, param.getType()

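getType() returns the type of each parameter in the project.

The query above prints the name and type of every method parameter.

What happens if we apply a cast? (Put simply, a cast filters out the values that do not have the given type and keeps those that do.)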

import java
 
from Parameter param
select param, param.getType().(RefType)

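Casting to RefType keeps, out of the previous results, only the parameters whose type is a RefType. What is RefType? A reference type: put plainly, what is left after removing primitive types such as int.

The result set is indeed smaller than before.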

import java
 
from Parameter param
select param, param.getType().(IntegralType)

This keeps only the parameters of an integral type such as int or long. Floating-point types are not included.
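Besides filtering, a cast also gives access to predicates that exist only on the narrower class. The sketch below assumes hasQualifiedName is available on RefType, as it is used on declaring types earlier in these notes, and lists parameters of type java.lang.String:

```ql
import java

from Parameter param
where
  // the cast drops parameters with non-reference types and
  // lets us call a RefType predicate on what remains
  param.getType().(RefType).hasQualifiedName("java.lang", "String")
select param, param.getType()
```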