coins & 博弈类DP
博弈的先后手
State:
- 先手是否能获胜/能够获得的最大利益(eg. dp[i] 表示还剩i个棋子时,先手的人能否获胜)
Function:
- 循环枚举先手的策略可能性
Intialization:
- 最极限/最小的状态下的先手的值
Answer:
- 整个问题先手是否可能获胜
先思考最小状态,然后思考大的状态 -> 往小的递推,那么非常适合记忆化搜索
coins in a line
note that, we know the state for this problem is dp[i] represent 现在还剩 i 个银币,现在先手取硬币的人最后输赢状况
function: dp[i] = not dp[i - 1] or not dp[i - 2]
Initialization: dp[0] = false, dp[1]= true, dp[2] = true
answer: dp[n]
public boolean firstWillWin(int n) {
// write your code here
if (n <= 0) return false;
if (n <= 2) return true;
// dp[0] = false; dp[1] = true; dp[2] = true;
// dp[n] = !(dp[n - 1] || dp[n - 2]);
boolean[] dp = new boolean[n+1];
dp[0] = false;
dp[1] = true;
dp[2] = true;
for (int i = 3; i <= n; ++i) {
dp[i] = !(dp[i - 1] && dp[i - 2]);
}
return dp[n];
}
coins in a line 2
in this problem, we need to modify our state and its corresponding function,
public boolean firstWillWin(int[] values) {
// write your code here
// index
// 0 1 2 3 4 5
// 5 6 7 2 3 5
//sum sum[1]
//dp dp[1] when get coins from n - i to n - 1, maximum amount of money that the first player can get
// dp[i] = max(sum[i] - dp[i - 1], sum[i] - dp[i - 2])
if (values == null || values.length == 0) return false;
int n = values.length;
int[] sum = new int[n + 1];
for (int i = 1; i <= n; ++i) {
// sum[0] is 0, sum[1] is values[n - 1] ... reverse summation of all the elements
sum[i] = sum[i - 1] + values[n - i];
}
int[] dp = new int[n + 1];
dp[1] = values[n - 1]; // with i coins left, the first player can get dp[i] amount of money
for (int i = 2; i <= n; ++i) {
dp[i] = Math.max(sum[i] - dp[i - 1], sum[i] - dp[i - 2]);
}
return dp[n] > sum[n] / 2;
}
or we can use memorization to address the problem
public boolean firstWillWin(int[] values) {
// write your code here
int n = values.length;
int []dp = new int[n + 1];
boolean []flag =new boolean[n + 1];
int []sum = new int[n+1];
int allsum = values[n-1];
sum[n-1] = values[n-1];
for(int i = n-2; i >= 0; i--) {
sum[i] += sum[i+1] + values[i];
allsum += values[i];
}
return allsum/2 < MemorySearch(0, n, dp, flag, values, sum);
}
int MemorySearch(int i, int n, int []dp, boolean []flag, int []values, int []sum) {
if(flag[i] == true)
return dp[i];
flag[i] = true;
if(i == n) {
dp[n] = 0;
} else if(i == n-1) {
dp[i] = values[i];
} else if(i == n-2) {
dp[i] = values[i] + values[i + 1];
} else {
dp[i] = sum[i] -
Math.min(MemorySearch(i+1, n, dp, flag, values, sum) , MemorySearch(i+2, n, dp, flag, values, sum));
}
return dp[i];
}
coins in a line 3
this is a hard problem, we should use DFS + DP to address this one. First we notice that the solution to the large problem is dependent on the smaller problem, then we find that we cannot find a specific sequence to use Dynamic progrmaming, so we can use search with memorization.
// 5 6 7 8 9 2 4
// i j
to maximize the sum of amount of money the first player can get, we should minimize the sum of money the second player can get.
So, define dp[i][j] = sum[i][j] - min(memorySearch(left+1, right, ...), memorySearch(left, right -1, ...)).
public boolean firstWillWin(int[] values) {
// write your code here
if (values == null || values.length == 0) return false;
int n = values.length;
boolean[][] flag = new boolean[n + 1][n + 1];
int[][] dp = new int[n + 1][n + 1];
int[][] sum = new int[n + 1][n + 1];
int allSum = 0;
for (int i = 0; i < n; ++i) {
for (int j = i; j < n; ++j) {
sum[i][j] = i == j ? values[j] : sum[i][j - 1] + values[j];
}
}
for (int a : values) {
allSum += a;
}
return allSum < 2 * memorySearch(0, n - 1, flag, dp, sum, values);
}
private int memorySearch(int left, int right, boolean[][] flag, int[][] dp, int[][] sum, int[] values) {
if (flag[left][right] != false) return dp[left][right];
flag[left][right] = true;
if (left > right) {
dp[left][right] = 0;
} else if (left == right) {
dp[left][right] = values[left];
} else if (left + 1 == right) {
dp[left][right] = Math.max(values[left], values[right]);
} else {
int cur = Math.min(memorySearch(left + 1, right, flag, dp, sum, values),
memorySearch(left, right - 1, flag, dp, sum, values));
dp[left][right] = sum[left][right] - cur;
}
return dp[left][right];
}