Coding Experience | Job Searching

本文主要讲解找工过程中的coding practice/interview的一些经验。

一些General的建议:

公司通过算法题面试主要是希望考察大家的程序实现能力。
Interview过程中：先解释再写（也就是先和面试官确认要怎么解），写完再解释。
有些可以通过helper函数来简洁代码阅读过程的操作是值得尝试的。这样便于面试官理解。
面试不一定会要求你用最优复杂度的算法来解决问题。（尽量做到各种解法都了解）。
想要做到 Bug Free 最重要的是优化你的 Coding Style 技巧:子函数 + 好的命名风格
带名字的算法都不考的，就不要花时间去准备了。
主要练习：medium难度的题。
不要花时间在Greedy算法。面试基本不会考，因为等同于考智力题或者是背诵题。一个面试官想要自己凭空创造出一个面试题是使用贪心算法的，是非常困难的。既然如此，如果面试中被问到了贪心算法，那么一定是一道经典的贪心问题，这类问题，我们可以称之为背诵题。因为大多数同学（除了智商很高，或者有算法竞赛经历的那一批），是不可能在面试的时候想得出解法的。贪心法，他不是“一个算法”，而是“一类算法”的统称，他更多的是一种高屋建瓴的算法思想，而不是具体实施的算法步骤。所以基本的情况就是，你在题目A里用了某个贪心算法解决了这个问题，然后这个题中用到的贪心法，永远也找不到第二个题用类似的方法来解决。
- 当然，面试中也不是说完全不可能碰到贪心算法，只是几率非常的小，你只需要“背诵”如下的一些几个题的贪心解法就了：http://www.jiuzhang.com/qa/2099/

与面试官的沟通技巧:

沟通的基本原则是什么?
- 不要把面试官当作你的 Interviewer，而要当作你的 Co-worker
- 你可以问他索要提示，但是尽可能的不要问太多提示。正如工作中，你可以问你的同事寻求帮助，但是你问太多，问得事无巨细人家也很烦。
- 沟通之后再动手。正如工作中，你的同事和你合作的时候，不会喜欢你一声不吭的先按照自己的想法把代码写了。
- 意见不合别吵架，先认可对方的想法。正如工作中，你和同事讨论一个问题的不同解决方案的时候，最好先说，我觉得你的方法挺好的，然后再说，不过我觉得有个问题。
写代码的时候该如何沟通？更好的办法是：
- 首先和面试官进行算法和实现方式上的沟通，从面试官那里得到确认你的方法是OK的，写出来是可以过的。
- 开始写代码时，只对一些可能对方不太看得懂的做解释。如果他正在玩手机没看你，就不用理他赶紧写完。
- 写完之后再一股脑给他解释代码。

如何做笔记？

算法
- 用什么算法
- 为什么用这个算法(哪些条件)
代码实现
- 有什么要注意的地方
- 有什么可优化的地方
时空复杂度分析(主要是time complexity)
相关的题目有哪些做过的(跟哪个题比较像)

DEBUG的基本步骤

重新读一遍程序 + 在程序的若干位置输出一些中间结果。
设置special test case：找到一个非常小非常小的可以让你的程序出错的数据。比如空数组，空串，1-5个数的数组，一个字符的字符串。
定位了出错的部分之后，查看自己的程序该部分的逻辑是否有错。如果无法通过肉眼看出错误的部分，就一步步“模拟执行”程序，找出错误。

代码真的不是写出来就可以过。好的代码质量包括:

Bug Free
好的代码风格(Coding Style)，包括变量名命名规范有意义，合理的使用空格，善用空行。
容易让人读懂的逻辑。要把复杂的事情用简单的方式，别把简单的事情写复杂了。
没有冗余代码
有边界检测和异常处理 (corner case)

几个技巧快速提高代码风格

严格按照要求进行程序缩进 - format要合理
二元运算符(作用两个变量的运算符) 左右两边加空格 (比如：四则运算)
if, for 和括号之间加空格；即使 if / for 语句内部只有一句话，也要加上花括号(这条看情况吧)。
变量名使用有意义的英文名，不要用a,b,c,s1,s2
区分不同的逻辑块，逻辑块之间用空行隔开，简要注释每个部分做的事情
多用 Helper Function 或子函数，不要所有程序都写在一个大函数里

评测何时做好面试的准备：(三个维度)

算法能力
Bug Free能力(每道题的平均提交次数)
题量(250-300)

刷题TimeLine

按照各topic的建议刷题数，刷够
leetcode top interview
各大厂的leetcode题库

Note: 面试前专门去各大网站（比如 glassdor，一亩三分地，等等），搜索近期或以往的面经，这是挺重要的一个准备环节，因为至少可以了解到该公司的出题习惯和风格。

STL常用函数 | 纠错笔记 | 基本知识整理

Memory: 栈区(stack)和堆区(heap)的区别

栈区（stack）：由编译器自动分配释放，存放函数的参数值，局部变量的值等，其操作方式类似于数据结构的栈。
堆区（heap）：一般是由程序员分配释放（比如allocate），若程序员不释放的话，程序结束时可能由OS回收，值得注意的是他与数据结构的堆是两回事，分配方式倒是类似于数据结构的链表。
全局区（static）：也叫静态数据内存空间，存储全局变量和静态变量，全局变量和静态变量的存储是放一块的，初始化的全局变量和静态变量放一块区域，没有初始化的在相邻的另一块区域，程序结束后由系统释放。
文字常量区：常量字符串(string)就是放在这里，程序结束后由系统释放。
程序代码区：存放函数体的二进制代码。

常用C++ Data types

All variables use data-type during declaration to restrict the type of data to be stored. Therefore, we can say that data types are used to tell the variables the type of data it can store. Whenever a variable is defined in C++, the compiler allocates some memory for that variable based on the data-type with which it is declared. Every data type requires different amount of memory. - primitive data type: built-in or predefined data types and can be used directly by the user to declare variables 1. integer 2. character 3. boolean 4. floating point & double floating point 5. valueness or void 6. Wide character - Abstract or user defined data types: defined by user itself. Like, defining a class in C++ or a structure. - Datatype Modifiers（类型修饰）: used with the built-in data types to modify the length of data that a particular data type can hold. Data type modifiers available in C++ are: - signed - unsigned - short - long 1 byte = 8 bits; signed将range缩小一倍。== ==unsigned int: 0 ~ 2^32-1 - 常见数据类型的范围： - char: 0 ~ 2^8 - 1(255) - int: -2147483648(-2^31)~2147483647(2^31 - 1) - short int: -32768(-2^15) ~ 32767(2^15 - 1)

|     Data Type     | sizeof()               |
| :---------------: | ---------------------- |
|       char        | 1 byte = 8bits         |
|        int        | 4 bytes = 32bits       |
|     short int     | 2 bytes = 16bits       |
|     long int      | 8 bytes = 64bits       |
|  signed long int  | 8 bytes (范围正负平摊) |
| unsigned long int | 8bytes (只有正数一边)  |
|       float       | 4 bytes                |
|      double       | 8 bytes                |
|      wchar_t      | 4 bytes                |

null 与空字符串的区别

null 表示空对象: 类似于python里面的s = None; Java里面的String s = null; C++不能直接定义一个空对象，但是可以将一个引用绑定在一个nullptr上： string &p = *static_cast<string *>(nullptr); 以上代码定义了一个空对象，我们不能对这个对象做任何操作，说明我们只是定义了变量，只是在==栈内存==中标记了这个变量的存在，但是并没有实际分配任何任何==堆内存==给这个变量，变量没有指向的地址，是个空对象。
C++:string s; 这个声明中，s不是空对象，是指向实实在在的堆内存的。只是这段内存中没有数据而已，s此时是个空串。我们可以对s做所有字符串的操作。例如取长度、拼接、替换、查找字符等。所以:
- C++里就不用额外考虑为null的情况，一般就额外考虑empty string的情况。
String s1 = ""; means that the empty String is assigned to s1. In this case, s1.length() is the same as "".length(), which will yield 0 as expected.
String s2 = null; means that null or "no value at all" is assigned to s2. So this one, s2.length() is the same as null.length(), which will yield a NullPointerException as you can't call methods on null variables (pointers, sort of) in Java.==

Coding Style

一种减少代码identation的书写形式

当for里面需要套if判断的时候，采用 第二种写法更节省indention，更显得简洁。

// 第一种
for ...
  if it's ok:
    row 1
    row 2
    ...
// 第二种
for ...
  if its' not ok:
    continue
  row 1
  row 2
  ...

for loop | 对于for loop的executing过程的理解：

首先initialize a iterate variable in the range of for loop，然后每次在执行loop内的内容时，都要先check condition（这个check意味着要执行condition这句）是否为true，即满足条件才会往内执行。执行完当前的loop内的内容后，update这个iterate variable，然后再判断是否满足条件，满足条件则继续进行；不满足则在update完后跳出。

//错误写法 - 可能会造成循环一直走不出而出现exceed memory limit error
queue<TreeNode*> q;
for (int i = 0; i < q.size(); i++){
  ... //存在改变q.size()的操作
}

//正确写法
queue<TreeNode*> q;
int n = q.size(); //这一步一定要单独拎出来！(在这种需要分层去traverse的情况下)
for (int i = 0; i < n; i++){
  ... //存在改变q.size()的操作
}

//因为如果直接写到for loop的condition，每次i update后都需要去check condition是否为true，同时就会执行q.size()，而实际上一次iteration内的操作改变了q的size，所以很可能出现=="loop本来应到此跳出时却因为size被enlarge了就继续执行下一次iteration"==的情况。

ASCII表

Capital Letters: A(65)-Z(90)
Lower Letters: a(97)-z(122)

String

如何判断两个字符串是否相等？ - 跟Python类似，C++也可以直接使用==比较字符串是否相等。比如：s1 == s2

如何遍历字符串?

string s = "Hello";
for (int i = 0; i < s.size(); ++i) {
    s[i] ...
}
//或者
for (char c: s) {
    c...
}
//跟上一种写法一样，但是此时改变c的值会同时改变原字符串(因为用了reference)
for (char& c: s) {
   c...
}
//其中s.size() 获取字符串的长度, 使用s[i]可以访问对应位置的字符。c++中的字符串是可变的，可以直接用s[i]=x的方式改变字符串。

string::back() Returns a reference to the last character of the string. This function shall not be called on empty strings.
string::front() Returns a reference to the first character of the string. Unlike member string::begin, which returns an iterator to this same character, this function returns a direct reference.
string::begin() Returns an iterator pointing to the first character of the string.
string::end() Returns an iterator pointing to the past-the-end character of the string. ==The past-the-end character is a theoretical character that would follow the last character in the string. It shall not be dereferenced==.
string::clear() Erases the contents of the string, which becomes an empty string (with a length of 0 characters).
string::compare() Compares the value of the string object (or a substring) to the sequence of characters specified by its arguments. The compared string is the value of the string object or -if the signature used has a pos and a len parameters- the substring that begins at its character in position pos and spans len characters. This string is compared to a comparing string, which is determined by the other arguments passed to the function. It returns 0 when they compare equal; < 0 when ==Either the value of the first character that does not match is lower in the compared string, or all compared characters match but the compared string is shorter==; > 0 when Either the value of the first character that does not match is greater in the compared string, or all compared characters match but the compared string is longer.
1. str1.compare(str2) 直接str1和str2比较
2. str1.compare(6, 5, str2) str1的(6 ~ 10)index的substring和str2比较
3. str1.compare(6,5,str2,4,5) str1的(6 ~ 10)index的substring和str2的(4 ~ 8)比较
string::c_str() Returns a pointer to an array that contains a null-terminated sequence of characters (i.e., a C-string) representing the current value of the string object. This array includes the same sequence of characters that make up the value of the string object plus an additional terminating null-character ('\0') at the end.
string::empty() Returns whether the string is empty (i.e. whether its length is 0).
str.erase() Erases part of the string, reducing its length:
1. str.erase (10,8); erase the substring starting from 10 and length is 8;
2. str.erase (str.begin()+9); erase index为9的char; str.erase(str.end() - 1)删掉str的最后一位char;
3. str.erase (str.begin()+5, str.end()-9); erase the substring from index为5(第6个) to index倒数第9
string::find() Searches the string for the first occurrence of the sequence specified by its arguments. When pos is specified, the search only includes characters at or after position pos, ignoring any possible occurrences that include characters before pos.
1. str1.find(str2) default pos is set to 0;
2. str1.find(str2, pos, n) n is Length of sequence of characters to match.
string::insert() Inserts additional characters into the string right before the character indicated by pos (or p).
1. string& insert (size_t pos, const string& str);
2. insert (size_t pos, const string& str, size_t subpos, size_t sublen)

vector

v.insert(v.begin(), 5)
return an empty vector: return vector<string>(); 或者 return {};

map & set

unordered_map
- unordered_map<int, int> 初始值为0。因为 int 不是类类型，所以会进行零值初始化。这类似于vector<int>在初始化一个数组时，假设没有填充一个特定的值，也是默认为0。
unordered_set
- unordered_multiset 与unordered_set的唯一区别：前者可以allow different elements to have equivalent values.

pair的使用：

// pair::pair example
#include <utility>      // std::pair, std::make_pair
#include <string>       // std::string
#include <iostream>     // std::cout
int main(){
  std::pair <std::string,double> product1;                     // default constructor
  std::pair <std::string,double> product2 ("tomatoes",2.30);   // value init
  std::pair <std::string,double> product3 (product2);          // copy constructor

  product1 = std::make_pair(std::string("lightbulbs"),0.99);   // using make_pair (move)

  product2.first = "shoes"; //first和second分别是pair的两个field
  product2.second = 39.90;
  
  std::cout << "The price of " << product1.first << " is $" << product1.second << '\n';
  std::cout << "The price of " << product2.first << " is $" << product2.second << '\n';
  std::cout << "The price of " << product3.first << " is $" << product3.second << '\n';
  return 0;
}

priority_queue

默认的是top()返回max，compare用的是less；如果compare换成greater，则返回min。
front() : return next element (按照priority的规则排序的)
push(): 插入一个element(要遵循priority的规则顺序)
pop_back(): delete the element returned by front()

template<typename T> void print_queue(T& q) {
    while(!q.empty()) {
        std::cout << q.top() << " ";
        q.pop();
    }
    std::cout << '\n';
}

int main() {
    std::priority_queue<int> q; //top()返回的是max
    for(int n : {1,8,5,6,3,4,0,9,7,2}) q.push(n); 
    print_queue(q);
 
    std::priority_queue<int, std::vector<int>, std::greater<int> > q2; //top()返回的是min
    for(int n : {1,8,5,6,3,4,0,9,7,2}) q2.push(n);
    print_queue(q2);
 
    // Using lambda to compare elements.
    auto cmp = [](int left, int right) { return (left ^ 1) < (right ^ 1);};
    std::priority_queue<int, std::vector<int>, decltype(cmp)> q3(cmp);
    for(int n : {1,8,5,6,3,4,0,9,7,2}) q3.push(n);
    print_queue(q3);
}

常考算法的经典题

整理前几章的题目到对应的typora page

Chapter 0: String相关

Valid palindrome - 有不同类型的char混杂
Valid palindrome II - 可以删掉一个
Longest Palindrome - 只是计算最长的长度 - 用hashset/hashvector先存后删的方式来count
Longest Palindromic Substring - expandAroundCenter + 超过max就记录max
Subsets II - 基于dfs的backtracking + 限制条件以避免重复解

Chapter 1: BFS

主要应用分为：二叉树(tree)上的 BFS; 图(graph)上的 BFS; 矩阵上的 BFS; 拓扑排序 Topological Sorting。 - Binary Tree Level Order Traversal - 利用BFS在traverse tree的时候的分层特性 - Serialize and Deserialize Binary Tree - serialize和deserialize都可以用BFS来实现，且是逆过程。==DFS(待implement)== - Clone Graph - 用bfs/dfs都行，关键在于是要借助于新老nodes之间的mapping来是实现deep copy - Word Ladder - 用bfs，关键在于要搞清楚在哪记录已经visit过的node(string) - Number of islands - bfs或者dfs，建议用bfs，因为dfs可能会很深导致stack overflow；==Union Find并查集(待implement)== - Course Schedule - 基于bfs或者dfs的topological sorting；==dfs待review== - Knight Shortest Path - simple graph里基于bfs - 双向BFS优化的解法 - Topological Sorting - general的拓扑排序求解合适的order - Sequence Reconstruction - 判断是不是有且只有一个合适的topological order

遗留问题：

Tological Sorting based on DFS
如何自己实现一个队列（queue）？分别用vector和LinkedList
用bfs实现：非递归的方式找所有方案
如果问最长路径，则要用Dynamic Programming。分层：走到第i层一定会经过第i - 1层。不可以分层：DFS。
Union Find
一道题：Alien Directionary
ppt上的related problems
随课教程的补充内容自学：另外两种宽度优先搜索算法。Bidirectional BFS。

Chapter 3: Binary Search

Classical Binary search - 在sorted array里找到any target position
First position of target - 重点移right
First Bad Version - 和上一题同样的思路
Last position of target - 重点移left
Search in a big sorted array - 用exponential backoff去找到合适的范围
Find k closest elements - binary search定范围，然后背向two pointers按order append
Find Minimum in Rotated Sorted Array - "断崖式Array" - 转换为find the first number smaller than target(last number)
Search in Rotated Array - "断崖式Array" 思路1：基于上一题的思路，先确定division的位置，再确定target应该在的那一半range；然后用binary search；思路2：直接binary search，但是在移left和right的时候需要分在左半range和右半range两种情况的讨论。
Maximum Number in Mountain Sequence - "Mountain Array" 用 '增' 或者 '减' 作为判断条件的binary search
Find (any) peak element - "Mountain Array" 和上题同样的思路
Fast Power / Pow(x, n)- "快速幂算法" - 借用bianry search的思想

Chapter 4: Binary Tree

主要是Traversal和Divide & Conquer的解题思想。 - Minimum Subtree - 累赘地分步走(多写helper function咯); ResultType包装return值; Private fields / reference（需要这种全局性的比较）; - Binary Tree Paths - Divide & conquer; traverse (NOT USE reference to avoid backtrack actively); - Lowest Common Ancestor - 多几个helper function做了重复遍历；one-time traverse but use divide & conquer; - Lowest Common Ancestor III - one-time traverse & use divide & conquer & RestultType; - Flattern Binary Tree to Linked List -

Chapter5: Two Pointers

Chapter 6: Implicit Graph DFS

Subsets - 是否需要去重
Combination Sum - 相当于是subsets + sum限制剪枝 - 每个数是否可以重复使用 - 是否避重
k Sum II - 相当于combination sum + k 限制剪枝
Permutation类问题 - 两种解法：dfs backtrack + record visited status; non-recursion的get_next_permutation（需要记忆）
N-Queens 八皇后问题 - 记录一个对应于board size的矩阵，来记录各个node的访问状态，在recur之前根据the previous queen position来update整个status board。
Letter Combinations of a Phone Number - 很典型的dfs问题（只不过implicit）
找all possible solutions之类的问题
- Split String：典型的dfs，search的每一步有多个选择，这也是dfs的意义：对于每一步来说，每一个选择都试一下，后面找不通了，回来试下一个选择。
- Word Pattern II：pattern的char和str的substr对应。 -
- Word Ladder II：既需要找到all solutions，而且要求shortest。- bfs限定shortest condition + dfs搜寻all possible solutions.

遗留问题：

涉及到trie结构的解法，解决hashmap可以解决的问题
教程

Chapter7: Hash/Heap

LRU cache - doubly linked list + hashmap (num to node)
Insert Delete GetRandom O(1) - vector + hashmap (num to index)
First unique number in datastream - doubly linked list + hashmap (num to node) + hashmap (num to freq) to remove duplicates / hashset (recording duplicates -> then we can choose not to add the number if already exists in duplicates hashset)
First unique char in string - 和上一题完全一样的做法，只不过换成了char和string。
Heapify - siftDown for every element of the array
Ugly Number II - minHeap(always use the min number to generate next batch of ugly numbers) + hashset(recording visited number)
- K Closest Points - maxHeap / HeapSort
- Top K Largest Numbers - minHeap / HeapSort / QuickSort
- Top K Largest Numbers - minHeap
- Merger K Sorted Lists - quick sort all vals + recreating all nodes / recreating all nodes in the order of 打擂台 / minHeap / mergeSort
- Moving Average from Data Stream - queue with limited maxsize to store data
- Implement stack by Two Queues
- Implement queue by Two Stacks

遗留问题

LRU cache - singly linked list version
Load balancer
Insert Delete GetRandom O(1) - with duplicates
First number in data stream II - with duplicates
Moving Average from Data Stream - 前缀解法 / 滚动窗口解法

Chapter8: Memoization Search & DP

Triangle
Wildcard Matching - '*' can be any char sequence.
Regular Expression Matching - '*' can repeat the preceding char zero or more times.
Word Pattern II - 只是为了说明不可以用memoization的情况
Word Break (可行性) - pure DFS / DFS + memoization + maxLen prunning / DP
Word Break II (all possible solutions) - DFS + memoization / DFS + dp-prunning
Word Break III (方案总数) DFS + memoization / pure DP
Palindrome Partitioning - 和Word Break II非常相似，前者是check word是否在dict里，而这题是check word是否是palindrome.
Stone Game - 区间型DP

遗留问题：

Regular Expression Matching需要复习
区间型的Stone Game需要复习
背包型
序列型的Decode ways 和双序列型的 Longest Common Subsequence & Edit Distance

一些General的建议:

与面试官的沟通技巧:

如何做笔记？

DEBUG的基本步骤

代码真的不是写出来就可以过。 好的代码质量包括:

几个技巧快速提高代码风格

评测何时做好面试的准备：(三个维度)

刷题TimeLine

STL常用函数 | 纠错笔记 | 基本知识整理

Memory: 栈区(stack)和堆区(heap)的区别

常用C++ Data types

null 与 空字符串的区别

Coding Style

ASCII表

String

vector

map & set

priority_queue

常考算法的经典题

Chapter 0: String相关

Chapter 1: BFS

Chapter 3: Binary Search

Chapter 4: Binary Tree

Chapter5: Two Pointers

Chapter 6: Implicit Graph DFS

Chapter7: Hash/Heap

Chapter8: Memoization Search & DP

代码真的不是写出来就可以过。好的代码质量包括:

null 与空字符串的区别