1/21 merge & sort

分为以下几类：

two sum类
partition 类
灌水类

Merge Sorted Array

这题的设定是一个array is big enough to hold another，所以不需要再创建新的array。从后往前比较A和B，把较大的放在A后面。

要注意如果B中可能会剩下一些更小的数，要在结尾加进去。

class Solution {
    public void merge(int[] nums1, int m, int[] nums2, int n) {
        if (nums1 == null || nums2 == null) {
            return;
        }
        int p1 = m - 1, p2 = n - 1;
        int k = m + n - 1;
        while (p1 >= 0 && p2 >= 0) {
            if (nums1[p1] > nums2[p2]) {
                nums1[k--] = nums1[p1--];
            } else {
                nums1[k--] = nums2[p2--];
            }
        }
        while (p2 >= 0) {
            nums1[k--] = nums2[p2--];
        }
    }
}

Merge Two Sorted Arrays

跟上一题的区别是要用一个新的array把merge的结果存起来，不是in-place操作，使用pointer。

public class Solution {
    public int[] mergeSortedArray(int[] A, int[] B) {
        // 
        if (A == null) return B;
        if (B == null) return A;

        int n = A.length, m = B.length;
        int p1 = 0, p2 = 0, i = 0;
        int[] array = new int[n + m];

        while (i < n + m) {
            if (p2 == m) {
                array[i++] = A[p1++];
            } else if (p1 == n) {
                array[i++] = B[p2++];
            } else {
                if (A[p1] > B[p2]) {
                    array[i++] = B[p2++];
                } else {
                    array[i++] = A[p1++];
                }
            }
        }

        return array;
    }
}

用while循环比较两个array，把较小的存在new array。要注意的细节是，使用lazy evaluation的方式在access两个数组数据前先check index有没有超出本身的长度。

Sort Integers II

Merge Sort ✔️
Quick Sort ✔️
复杂度分析 ✔️

排序问题，要按要求和input选择方法。需要clarify的地方，首先能否do in-place。其次是input是否nearly sorted，还是普通乱序的。

-Merge Sort runtime: O(n log(n)) for average and worst case. space: O(n)

public class Solution {

    public void sortIntegers2(int[] A) {
        // write your code here
        if (A == null || A.length == 0) return;

        mergeSort(A, 0, A.length - 1);
    }

    private void mergeSort(int[] arr, int low, int high) {
        if (low < high) {
            // get the mid position
            int mid = low + (high - low) / 2;
            // sort two subarray
            mergeSort(arr, low, mid);
            mergeSort(arr, mid + 1, high);
            // merge
            merge(arr, low, mid, high);
        }
    } 

    private void merge(int[] arr, int low, int mid, int high) {
        // create two temp arrays
        int[] A = new int[mid - low + 1];
        int[] B = new int[high - mid];

        // copy data to temp arrays
        for (int i = 0; i < A.length; i++) {
            A[i] = arr[low + i];
        }
        for (int i = 0; i < B.length; i++) {
            B[i] = arr[mid + 1 + i];
        }
        // merge the temp arrays
        int p1 = 0, p2 = 0, i = low;
        while (p1 < A.length && p2 < B.length) {
            if (A[p1] < B[p2]) {
                arr[i++] = A[p1++];
            } else {
                arr[i++] = B[p2++];
            }
        }

        while (p1 != A.length) {
            arr[i++] = A[p1++];
        }
        while (p2 != B.length) {
            arr[i++] = B[p2++];
        }
    }
}

-Quick Sort runtime: O(n log(n)) average, O(n^2) worst case space: O(log(n))

public class Solution {
    /*
     * @param A: an integer array
     * @return: 
     */
    public void sortIntegers2(int[] A) {
        // write your code here
        if (A == null || A.length == 0) return;

        quickSort(A, 0, A.length - 1);
    }

    private void quickSort(int[] arr, int low, int high) {
        if (low < high) {
            // set mid number as pivot
            int mid = low + (high - low) / 2;
            // move pivot to the end of array
            int pivot = arr[mid];
            swap(arr, mid, high);
            // partition
            int index = partition(arr, low, high, pivot);
            // quickSort two subarray
            quickSort(arr, low, index - 1);
            quickSort(arr, index + 1, high);
        }
    }

    private int partition(int[] arr, int low, int high, int pivot) {
        int left = low;
        int right = high;

        while (left <= right) {
            while (left <= right && arr[left] < pivot) left++;
            while (left <= right && arr[right] >= pivot) right--;

            // swap elements, and move left and right indices
            if (left <= right) {     // attention: not only swap() elements, but also move left and right..
                swap(arr, left, right);
                left++;
                right--;
            }
        }
        swap(arr, high, left);
        return left;
    }

    private void swap(int[] arr, int a, int b) {
        int temp = arr[a];
        arr[a] = arr[b];
        arr[b] = temp;
    }
}

Merge sort和Quick sort都采用的递归，时间复杂度都是O(nlgn)。区别是Quick sort是in-place operation，是global -> local的过程，而Merge sort要创建新的array去存data，local -> global，O(n) extra space。

Quick sort 的worst case是每次不能把问题分两半，input scale decreases too slow。那么时间复杂度会接近O(n^2)。

1/8补充，

Insertion Sort ✔️

in-place，但是较慢，时间复杂度average and worst caseO(n^2)，eg.[5,4,3,2,1]。best case O(n) if nearly sorted

Sort Integers

思路是从数组的第二个开始遍历，每次要先把当前的存入temp，用它和之前的所有data去比较，如果后面比前面的大，就右移，直到找到temp应该插入的新位置。

void insertionSort(int arr[])
{
    int n = arr.length;
    for (int i=1; i<n; ++i)
    {
        int key = arr[i];
        int j = i-1;

        /* Move elements of arr[0..i-1], that are
           greater than key, to one position ahead
           of their current position */
        while (j >= 0 && arr[j] > key)
        {
            arr[j+1] = arr[j];
            j = j-1;
        }
        arr[j+1] = key;
    }
}

Kth Largest Element in an Array

第一眼看这题BF先sort，再返回第k大的element，复杂度是O(nlgn)。但考点并不是写sort，而是基于quick sort的quick selection。注意第二层while循环，大数放左边nums[i] > pivot，小数放右边nums[j] < pivot，都不能有等号。

复杂度分析 ✔️

class Solution {
    public int findKthLargest(int[] nums, int k) {
        if (nums == null || nums.length == 0) return 0;

        return quickSelect(nums, 0, nums.length - 1, k);
    }

    private int quickSelect(int[] nums, int start, int end, int k) {
        // base case
        if (start == end) {
            return nums[start];
        }

        int i = start, j = end;
        int mid = i + (j - i) / 2;
        int pivot = nums[mid];
        while (i <= j) {
            while (i <= j && nums[i] > pivot) i++;
            while (i <= j && nums[j] < pivot) j--;

            if (i <= j) {
                swap(nums, i, j);
                i++;
                j--;
            }
        }

        // start ... j(.)i... end
        // start ... j
        if (start + k - 1 <= j) {
            return quickSelect(nums, start, j, k);
        }
        // i ... end
        if (start + k - 1 >= i) {
            return quickSelect(nums, i, end, k - (i - start));
        }
        // j(.)i
        return nums[j + 1];
    }

    private void swap(int[] nums, int a, int b) {
        int temp = nums[a];
        nums[a] = nums[b];
        nums[b] = temp;
    }
}

使用quick sort，但每次只搜索有解的那一半。通过O(n)的时间把规模为n的问题变成n/2，优化后的时间复杂度是O(n)。

12/08重做时写得太快，少了一种corner case。有可能在最后一次sort(start = end)时才能找到解。

如果把quick select用在求top k largest 的问题里，且在不要求返回的数组有序的情况下，时间复杂度是O(n)，比用heap还快。

related problems: kth smallest number in a matrix; (using queue) <-> search 2D matrix II (start at top-right corner)

Bucket Sort

First missing positives

The key idea is to rearrange the array so that value will be stored in the (value-1)th slot for positive values. Because we only care about the nums[i](nums[i] > 0 && nums[i] <= nums.length) in specific range. Also, the value and index have some mapping. So we can use bucket sort to solve the problem.

e.g. [1,2,3,4, ...]

After you rearrange the array, you scan the array to find the first missing positive number.

class Solution {
    public int firstMissingPositive(int[] nums) {
        if (nums == null || nums.length == 0) return 1;

        // this is kind of bucket sort
        // we only care about the nums[i](nums[i] > 0 && nums[i] <= nums.length) in specific range. 
        // [0, 1, 2, 3]
        // [1, 2, 3, 4]
        // we need to garantee that nums[i] = i + 1;
        // take an example, 
        // [3, 4, -1, 1]
        // => [-1, 4, 3, 1]
        // => [-1, 1, 3, 4]
        // => [1, -1, 3, 4]
        for (int i = 0; i < nums.length; ++i) {
            while (nums[i] > 0 && nums[i] <= nums.length && nums[nums[i] - 1] != nums[i]) {
                // swap two numbers
                int temp = nums[nums[i] - 1];
                nums[nums[i] - 1] = nums[i];
                nums[i] = temp;
            }
        }

        for (int i = 0; i < nums.length; ++i) {
            if (nums[i] != i + 1) {
                return i + 1; 
            }
        }
        return nums.length + 1;
    }
}

Note, the time complexity is O(n) and space complexity is O(1).

Reverse Pairs

BF可以用两层for循环，依次检查比较大小，时间复杂度O(n^2)。

优化方法是用归并排序，时间复杂度是O(nlgn)，trade off是需额外空间。

具体思路是：和归并排序一样，先需要把数组拆分至最小单元，通过比较相邻两个数字的大小，如果后面比前面的大，逆序对加1。下一层比较的是两个数组，这里和归并不同的是，比较的顺序是从后往前，每次逆序对增加的对数是较小的数的下标，这样就不会漏掉逆序对。比如在比较[1,4] [2,3]两个数组时，4大于3，逆序对的个数累加3之前的所有数(count += 2)，因为4肯定比3之前的所有数都大。再把4放入merge array，比较1和3。

class Solution:
    count = 0

    def reversePairs(self, A):
        if not A or len(A) == 1:
            return 0

        self.mergeSort(A)

        return self.count

    def mergeSort(self, A):
        if not A:
            return []

        if len(A) == 1:
            return A[:]

        mid = len(A) // 2
        left = self.mergeSort(A[:mid])
        right = self.mergeSort(A[mid:])

        return self.merge(left, right)

    def merge(self, left, right):
        m, n = len(left), len(right)
        merged_length = m+n
        merged_array = [None] * merged_length
        i, j, k = m-1, n-1, merged_length - 1

        while k >= 0:
            is_left_exhausted = i < 0
            is_right_exhausted = j < 0

            if not is_left_exhausted and (is_right_exhausted or left[i] > right[j]):
                merged_array[k] = left[i]
                i -= 1
                self.count += j + 1

            else:
                merged_array[k] = right[j]
                j -= 1

            k -= 1

        return merged_array

similar problems: Count of smaller numbers after self, Count of range sum

merge & sort