coins & 博弈类DP


博弈的先后手

  • State:

    • 先手是否能获胜/能够获得的最大利益(eg. dp[i] 表示还剩i个棋子时,先手的人能否获胜)
  • Function:

    • 循环枚举先手的策略可能性
  • Intialization:

    • 最极限/最小的状态下的先手的值
  • Answer:

    • 整个问题先手是否可能获胜

先思考最小状态,然后思考大的状态 -> 往小的递推,那么非常适合记忆化搜索

coins in a line

note that, we know the state for this problem is dp[i] represent 现在还剩 i 个银币,现在先手取硬币的人最后输赢状况

function: dp[i] = not dp[i - 1] or not dp[i - 2]

Initialization: dp[0] = false, dp[1]= true, dp[2] = true

answer: dp[n]

public boolean firstWillWin(int n) {
    // write your code here
    if (n <= 0) return false;
    if (n <= 2) return true;

    // dp[0] = false; dp[1] = true; dp[2] = true;
    // dp[n] = !(dp[n - 1] || dp[n - 2]);

    boolean[] dp = new boolean[n+1];
    dp[0] = false;
    dp[1] = true;
    dp[2] = true;
    for (int i = 3; i <= n; ++i) {
        dp[i] = !(dp[i - 1] && dp[i - 2]);
    }
    return dp[n];
}

coins in a line 2

in this problem, we need to modify our state and its corresponding function,

public boolean firstWillWin(int[] values) {
    // write your code here

    //    index
    //    0 1 2 3 4 5
    //    5 6 7 2 3 5 
    //sum         sum[1]
    //dp           dp[1] when get coins from n - i to n - 1, maximum amount of money that the first player can get
    //             dp[i] = max(sum[i] - dp[i - 1], sum[i] - dp[i - 2])

    if (values == null || values.length == 0) return false;
    int n = values.length;
    int[] sum = new int[n + 1];
    for (int i = 1; i <= n; ++i) {
        // sum[0] is 0, sum[1] is values[n - 1] ... reverse summation of all the elements
        sum[i] = sum[i - 1] + values[n - i];     
    }

    int[] dp = new int[n + 1];
    dp[1] = values[n - 1];     // with i coins left, the first player can get dp[i] amount of money 
    for (int i = 2; i <= n; ++i) {
        dp[i] = Math.max(sum[i] - dp[i - 1], sum[i] - dp[i - 2]); 
    }

    return dp[n] > sum[n] / 2;
}

or we can use memorization to address the problem

public boolean firstWillWin(int[] values) {
    // write your code here
    int n = values.length;
    int []dp = new int[n + 1];
    boolean []flag =new boolean[n + 1];
    int []sum = new int[n+1];
    int allsum = values[n-1];
    sum[n-1] = values[n-1];
    for(int i = n-2; i >= 0; i--) { 
        sum[i] += sum[i+1] + values[i];
        allsum += values[i];
    }
    return allsum/2 < MemorySearch(0, n, dp, flag, values, sum);
}
int MemorySearch(int i, int n, int []dp, boolean []flag, int []values, int []sum) {
    if(flag[i] == true)
        return dp[i];
    flag[i] = true;
    if(i == n)  {
        dp[n] = 0;  
    } else if(i == n-1) {
        dp[i] = values[i];
    } else if(i == n-2) {
        dp[i] = values[i] + values[i + 1]; 
    } else {
        dp[i] = sum[i] -
            Math.min(MemorySearch(i+1, n, dp, flag, values, sum) , MemorySearch(i+2, n, dp, flag, values, sum));
    }
    return dp[i];
}

coins in a line 3

this is a hard problem, we should use DFS + DP to address this one. First we notice that the solution to the large problem is dependent on the smaller problem, then we find that we cannot find a specific sequence to use Dynamic progrmaming, so we can use search with memorization.

// 5 6 7 8 9 2 4

// i j

to maximize the sum of amount of money the first player can get, we should minimize the sum of money the second player can get.

So, define dp[i][j] = sum[i][j] - min(memorySearch(left+1, right, ...), memorySearch(left, right -1, ...)).

public boolean firstWillWin(int[] values) {
    // write your code here
    if (values == null || values.length == 0) return false;
    int n = values.length;

    boolean[][] flag = new boolean[n + 1][n + 1];
    int[][] dp = new int[n + 1][n + 1];

    int[][] sum = new int[n + 1][n + 1];
    int allSum = 0;
    for (int i = 0; i < n; ++i) {
        for (int j = i; j < n; ++j) {
            sum[i][j] = i == j ? values[j] : sum[i][j - 1] + values[j];
        }
    }
    for (int a : values) {
        allSum += a;
    }

    return allSum < 2 * memorySearch(0, n - 1, flag, dp, sum, values);
}

private int memorySearch(int left, int right, boolean[][] flag, int[][] dp, int[][] sum, int[] values) {
    if (flag[left][right] != false) return dp[left][right];
    flag[left][right] = true;

    if (left > right) {
        dp[left][right] = 0;
    } else if (left == right) {
        dp[left][right] = values[left];
    } else if (left + 1 == right) {
        dp[left][right] = Math.max(values[left], values[right]);
    } else {
        int cur = Math.min(memorySearch(left + 1, right, flag, dp, sum, values), 
            memorySearch(left, right - 1, flag, dp, sum, values));
        dp[left][right] = sum[left][right] - cur;
    }
    return dp[left][right];
}

results matching ""

    No results matching ""