Graphs
 Longest path in DAG (Directed Acyclic Graph): Both greedy and dynamic approach:
1. Find topological sort
2. Init distance from the source to each node to be minus infinite and zero for the source node.
3. In the topological order, start from the source node and for each adjacent node update the maximum path from parent to it:max = max(current dist, 1+parent dist)
 Shortest path in DAG: Same as the longest path, but instead of minus infinite, we init with plus infinite and instead of calculating maximum in step 3, we calculate minimum.
 Cycle in an undirected graph (No selfedges):
1. use UNIONFIND: for each edge, assign a representative node. if two nodes are in the same set, then there is a cycle:
UnionFind: For a graphG(V, E)
, create an array with sizeV=n
then init each cell with 1. This means that as a first step, each node is an independent subset.
Union: for the graph for edge
{01}
, let 1 be representative of 0.
0 1 2
1 1 1  for edge
{12}
, the representative of 1 is 1(deduced from I – parent[1]=1) and representative of 2 is 2(since parent[2]=1). e.g, they are in different subsets. lets select 1 to be representive of this edge
0 1 2
1 1 1  for
20
, 2 is represented by 1 and 0 is represented by 1. then they are in the same subset
Find: the representative of 1 is found by jumping to the parent in the parent[1] until parent[i] is 1. then i is the ultimate representative of 1. for example, the parent of 1 in the next array is 3
0 1 2 3
1 2 3 1
Time complexity: O(ElogV)
2. Use DFS: for edgeuv
, if v is already visited and v is not parent of u (we didn’t reach u from v), then there is a cycle.
Time Complexity: O(V+E)
 for edge
Tricks
n&(n1) == 0
: check if n is power of 2 or 0 numbers between 0 to n, the last digit will be repeated every 10 numbers. the last two digits of each number will be repeated after 10^2 numbers and the last i digits of each number will be repeated every 10^i numbers
 a group of integers all distinct and in a given range: we can represent these numbers with bits in integers. If we have 42 numbers, we need two integers.
 Lagrange’s foursquare theorem:
 2^n = 1<<n
 100001 = 1111 : convert the zeros to 1’s
 ~0 gives a vector of binary 1’s
 in order to reach the middle of the list, use fast iterator which jump two steps each time:
while(fast && fast.next){ fast = fast.next.next slow = slow.next }
 Implement an algorithm to find the nth to last element of a singly linked list: The idea is we want to pointers with n steps between them so we walk those two pointers togather untel the leading one reaches the end. by then, since the distance between them is n, the first pointer will point to the n’th to the last. so we need two pointers
p1
andp2
pointing to the beginning of the list. walkp1
n steps, then walkp1
andp2
untilp1
reaches the end of the list. returnp2

Bit Manipulation, Add two numbers without arithmatic operation:
 if you xor two numbers, it is like adding them without the carrys
 if you
and
two numbers andshift
the result, it will give you the carries.
 Select
m
random numbers from an array of size n with equal probability: the trick here is to select a random index between the indices m to n and then swap the selected index with the first index in the array. the next index we randomize will be after the already selected msized tail and will be replaced with the second element…etc.
Knowledgebased
 Convert decimal fraction to binary: here
 immutable object(unchangeable^{[1]} object) is an object whose state cannot be modified after it is created.
 concurrency vs parallelism: In programming, concurrency is the composition of independently executing processes, while parallelism is the simultaneous execution of (possibly related) computations. Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once”. assume a server that gets many requests from different users, then we need to deal with concurrency. For a request, we need to execute several services in order to supply a response. this is parallelism.
 How to debug segmentation fault: core dump file & gdb; debugfs, dmesg
 Mutual Exclusion:Resources are unsharable, so if a process acquires a resource, other processes have to wait before using it.
 Thrashing: Thrashing usually refers to phenomena in memory and storage. It occurs when OS spends most of the time on handling page faults(paging in or paging out) rather than getting anything done. For example:
 process A has some pages in hard drive and gets a page fault.
 The missing page(not in memory) needs to be swapped into memory.
 Suppose the memory is full, pages belong to process B need to be swapped out.
 Then process B needs to run, so its pages need to be swapped in and the process A’s pages that were just swapped in need to be swapped out
 This could happen indefinitely until the issue is resolved or one of the process finishes its job.
As we can see the performance is terrible, we need to detect the problem first before solve it(or least alleviate it). Kernel usually keep track of the number pagein and pageout, if the number is too high, the physical memory is almost full and CPU utilization is low, the thrashing is happening. There should be a formula to determine thrashing based on number of pagein and pageout in certain period, etc. In userland, there are some tools like vmstat and atop to help us detect thrashing.
When thrashing is detected, different OSes may deal with it in different ways. We could either kill a process or swap a process completely into secondary storage. Additionally, we could also add more RAM to the machine:)
Operating Systems
Processes
when linux loads, it starts 2 processes: swapper and then init. each process has pid – integer 32bit. means there can be 32k processes in total. a process can create a child process using fork()
or create another totally different process using execv()
execv()
will terminate the calling process and start a whole new process
wait(*status)
is called by the father in order to wait for its child process to finish.
 when the shell receives a command, it forks a process which will execute the command and then wait for it to finish, afterward, the shell will move to the next command.
each process has a state, this state can be
 TASK_RUNNING: the task is ready to run
 TASK_INTERRUPTIBLE: task is waiting for something but that can be stopped by sending the process a signal.
 TASK_UNINTERRUPTIBLE: like the previous one but the process waiting can be stopped only by the event that the process is waiting for.
 TASK_STOPPED AND TASK_ZOMBIE
A process has a kernel mode stack which is used in order to switch between user space and kernel space.
A process is created with one thread called the primary thread
Threads
clone(): When a process is created, it is created with one thread called the primary thread. it can be duplicated using the call clone().
the kernel creates threads using the call sys_clone()
and not clone()
.
threads share the same process environment while creating a new process using for fork is obligated with creating a new environment.
In a process, each thread has its own stack and registers.
threads are supported in the kernel level. basically, each thread has a pid as it is considered somehow a process. however, according to the threads standard, for the user, all the threads have the same pid. in the process descreptor fields, the threads pids are available in a variable named thread_group
and a variable named tpid
which contains the pid that represents all the threads.
when a thread creates a process using fork()
, the process has one parent which is that thread. the other threads cannot wait()
on this process.
when running execv()
inside a thread, it creates a new process with a new thread – the primary thread of the new process. the rest of the threads are killed.
 Thread safe in java: declare method/class as
synchronized
 Deadlock: happens because of 4 reasons:
 mutual execution: only one process can use the resource at a given time
 Hold & Wait: processes already holding the resource can request new ones
 No Preemption: processes cannot remove a process already using the resource
 starvation:two or more processes are waiting on different resources
IPC – inter process communication
each process has a table named PDT  Process descriptor table.
this table contains the descriptors of the files that are used by the devices.
The communication methods are:
 Pipes/FIFOs: Used to send data between two processes which have fathersonbrother connection. has two file descriptors, one for read and the other for write. the pipe is not mapped in the filesystem. FIFO is a pipe, however is global and appears in the filesystem. it also can be used by all the processes.
 Signals: asyncronouis call triggered by the kernel or from another process. each signal has a number. a process can decide what to do with the signal. example of sending signal: cpu will send signal if dividing by zero.
kill()
will sendSIGKILL
to process.  Sockets: is used crossmachine communication
Interrupts
The interrupt is a signal handled in the kernel by the CPU. The CPU Checks for interrupts between each two commands it executes. if there is an interrupt waiting, the kernel switches the user stack to the kernel mode stack of the current process in order to handle the interrupt (in linux, interrupts are handled by the kernel).
IDT – Interrupt Descriptor table: a table loaded by BIOS and pointed by a special register named lidt
. This table contains 256 entries mapping each possible interrupt to execution method handler.
There are two types of interrupts
 Hardware interrupts: sent by external Hardware. This type of interrupt is of course asyncronous.
 the hardware sends
IRQn (Interrupt request #n)
to theAPICAdvanced Programmable Interrupt controler
 The
APIC
check if theIF  Interrupt flag
is turned off so it send it to the CPU.  The CPU tells the
APIC
to block this type of interrupt and allow others – in order to ban collisions in the handler of the received interrupt  The interrupt handler code
IRS
receives severalIRQs
that may be the ones who sent the interrupt – as to one usb port there may be several connected devices.
 the hardware sends
 Software Interrupt(Exceptions): This type of interrupt is syncronous as it is generated by the CPU itself. There are two possible reasons for such interrupt: whether an error in the previous execution command such as division by zero, breakpoint or an interrupt generated by the cpu in multicpu machines as a way of communication with other cpu’s.
 Aborts: Dangrous – accessing bad memory
 Traps: like breakpoint
 Faults: like division by zero
Q/A
What is the difference between mutex and semaphore
Both Semaphore and Binary Semaphore does not have ownership of the thread that locked it while mutex does.
This is how lock on mutex(spinlock) is implemented:
the testandset instruction is an instruction used to write 1 (set) to a memory location and return its old value as a single atomic (i.e., noninterruptible) operation. If multiple processes may access the same memory location, and if a process is currently performing a testandset, no other process may begin another testandset until the first process’s testandset is finished.
What’s happened in the Linux System after using ‘ls’ command
strace cmd
tells us all activities behind the command.
Trees
 Any two different tree scans can restore a tree. i.e, Inorder, Preorder can restore a tree
Algorithms
 Dijkstra’s algorithm is an algorithm for finding the shortest paths between nodes in a weighted graph,
 Kruskal’s algorithm is a minimumspanningtree algorithm which finds an edge of the least possible weight that connects any two trees in the forest.[1] It is a greedy algorithm in graph theory as it finds a minimum spanning tree for a connected weighted graph adding increasing cost arcs at each step
 Bellman–Ford algorithm is an algorithm that computes shortest paths from a single source vertex to all of the other vertices in a weighted digraph. It is slower than Dijkstra’s algorithm for the same problem, but more versatile, as it is capable of handling graphs in which some of the edge weights are negative numbers.
Java’s Interview tools
 Use protected fields rarely: protected fields are accessible from subclasses of the class where it is declared, and from any class in the declared package.
 four fundamental OOP concepts: encapsulation, inheritance, polymorphism, and abstraction
 With composition (also called the decorator pattern), you give your new class a private field that references an instance of the existing class. Instance methods in the new class invoke a corresponding method in the existing class, i.e. method forwarding.
 If you use PriorityQueue in java, the generic class must implement the interface
Comparable
<
Tuple
>
which overrides the method:public
int
compareTo
(Tuple that)
 Count bits:
Integer.bitCount(5) = 2
Data Structures
 In Java, Heap is PriorityQueue
 Divide and Conquir paradigm: In divide and conquer approach, the problem in hand, is divided into smaller subproblems and then each problem is solved independently. When we keep on dividing the subproblems into even smaller subproblems, we may eventually reach a stage where no more division is possible. Those “atomic” smallest possible subproblem (fractions) are solved. The solution of all subproblems is finally merged in order to obtain the solution of an original problem.
 Binary Heap: is a binary tree where all levels are filled. it can be implemented as an array. navigate to parent/left/right as follows:
Arr[i/2] Returns the parent node Arr[(2*i)+1] Returns the left child node Arr[(2*i)+2] Returns the right child node
OOP
 Encapsulation: Using public, private and protected with getters and setters.
Sort Problems
 Bucket Sort – Top K Frequent Elements, here
Problems Worth review
Longest Increasing sequence:
Let’s say I am standing on element i in the array arr. if i know for each j<i the LIS, then the LIS for arr[i] would be arr[j]+1 under the constraint arr[j]<arr[i] for each j<i. if we have several j’s which hold the constraint, then we select the one which maximizes the sum. This is a dynamic programming solution. by adopting a bottom’s up approach, we can initialize an array named dp with size arr where dp[i] represents the LIS of the element arr[i]. in the beginning, each cell of dp will be initialized with 1 as each number is LIS of itself. starting with i=1, we scan all the previous cells of dp to find the j that maximizes dp[i]: for each j<i and arr[j]<arr[i], let dp[i]=Max(dp[i],1+dp[j])
The time complexity her is O(n^2). this can be solved with O(nlogn): a nice trick is to use Arrays.searchBinary(arr,start,end,val)
which will return the (i+1) index that the val should be placed in case it wasn’t found – (if the array is not sorted). Link.
sort linkedlist
with O(logn)
and O(1)
space: mergesort bottomup as referenced here
Kth Smallest Element in a Sorted Matrix:
link, Given a n x n matrix where each of the rows and columns are sorted in ascending order, find the kth smallest element in the matrix.Note that it is the kth smallest element in the sorted order, not the kth distinct element.Example:
matrix = [ [ 1, 5, 9], [10, 11, 13], [12, 13, 15] ], k = 8, return 13.
Note:
You may assume k is always valid, 1 ≤ k ≤ n^{2}.
Solution: Use a min heap: place the first row and take k1 times the min element. when an element is out, put in the min heap the next element in the same column
Given sorted matrix NxN, you can find a number with time complexity O(logN)
: divide the matrix to 4 sections, choose the section where the number we are looking for is in the range…and do the same division to this section and so on..
Coins change Problem:
Given infinite numbers of coins, given the denominations, how many ways there is for change?
Using dynamic programming: The variables are the number of coins and the value we are trying to calculate. so change[i][j]
equals the number of ways to accomplish amount j
using first i
coins. we are looking for change[S][x]
.
change[i][j] = change[i1][j]+change[i][jcoins[i]]
Means, that the way to accomplish change[i][j]
is by using the i’th coin and by not using the i’th coin.
We can make it more space efficient by using 1dimention array:
change[i]
equals they number of ways to accomplish value i
.
for each coin, change[i] =change[i] + change[icoin]
Merge two lists: link here, Learn how to write a short solution.
Symmetric tree: the iterative solution got a nice twict. link here.
Palindrome list: link here, trick how to reach the middle of the list and convert it
Shortest Unsorted Continuous Subarray: link here,Given an integer array, you need to find one continuous subarray that if you only sort this subarray in ascending order, then the whole array will be sorted in ascending order, too.You need to find the shortest such subarray and output its length.
Queue Reconstruction by Height: link here
Find the Duplicate Number: here, the trick is two pointers, one fast and one slow. if they meet, then they meet in the loop. we keep the slow pointer rotating in the loop while we start a new pointer from the beginning. they will meet again in the beginning of the loop
Unique paths: here, Dynamic programming. what is the number of the paths that a robot who can go right or down tries to start from the up left corner to the bottom right. the solution is to use a matrix and do the bottom’s up approach: : suppose the number of paths to arrive at a point (i, j)
is denoted as P[i][j]
, it is easily concluded that P[i][j] = P[i  1][j] + P[i][j  1]
.
Longest Substring Without Repeating Characters: Given a string, find the length of the longest substring without repeating characters.
Coin Change: You are given coins of different denominations and a total amount of money amount. Write a function to compute the fewest number of coins that you need to make up that amount. If that amount of money cannot be made up by any combination of the coins, return 1
.
Find loop in a list: This question is solved using the fast and the slow pointer.
The slow pointer enters the cycle after k steps. The fast pointer by then walked 2k
steps. Therefore, assuming the cycle length is m
, then the distance of the slow pointer from the fast pointer is mk
. in order to kill that distance, on each step, the fast pointer will decrease the distance by 1 step as it goes twice faster. Therefore, after mk
steps that the slow pointer do (the fast went 2(mk)
), they will meet.
However, After those mk
steps, the slow pointer will be k
steps before the beginning of the cycle as its length is m
.
We don’t know how much is k
. However, we can do a trick in order to find it:
We send the slow pointer to the beginning and we slow down the fast pointer to go one step at a time. Now we let them run again with one step at a time. After the previously slow pointer moves k
step, it will meet the previously fast pointer at the beginning of the cycle as they meet k steps before the beginning.
Amazon – Annapurna:
Givan a Tx Queue with length 16bytes. when the queue sends the 5th byte(moving from capacity 5 to 4) an interrupt is sent. your API providessend_tx(buffer,size)
. implement a drive api with an interrupt handler. The stack API that the drive can use is push_tx(byte)
. you cannot overflow the queue as the result will not be expected.
Notice: You are allowed to keep buffers until certain capacity is reached.
Solution:
References
sort algorithms: https://www.topcoder.com/community/datascience/datasciencetutorials/sorting/
UML: https://www.visualparadigm.com/guide/umlunifiedmodelinglanguage/umlaggregationvscomposition/