Statistics Note

  1. Hoeffding’s inequality: sets the lower and upper bounds of a confidence interval for a bounded random variable. The inequality holds for both large and small samples. Check MIT 18.650 Lecture Notes, page 27.

  2. Slutsky’s theorem: justifies replacing the theoretical mean with the empirical mean when deriving the confidence interval. Check MIT 18.650 Lecture Notes, page 43.

  3. Quadratic risk of estimator
    • E\left[\left(\hat{\theta_n} - \theta\right)^2\right] = E\left[\left(\hat{\theta_n} - E\left[\hat{\theta_n}\right] + E\left[\hat{\theta_n}\right]  -\theta\right)^2\right] = E\left[\left(\hat{\theta_n} - E\left[\hat{\theta_n}\right]\right)^2\right] + \left(E\left[\hat{\theta_n}\right]  -\theta\right)^2 = variance + bias^2
  4. Standard error of OLS:
    • y - X\beta = \epsilon \sim N\left(0, \sigma^2 I\right)
    • \hat{\sigma} = RMSE\left(y - X\hat{\beta}\right) = \sqrt{\sum_i{\left(y_i  - X_i\hat{\beta}\right)^2}/N}
    • var\left(y\right) = var\left(\epsilon\right) = \hat{\sigma}^2 I
    • var\left(\hat\beta\right) = var\left(\left(X^TX\right)^{-1}X^Ty\right) = \left(X^TX\right)^{-1}X^T var\left(y\right) X\left(X^TX\right)^{-1} = \hat{\sigma}^2\left(X^TX\right)^{-1}
    • The standard error of the coefficient is the square root of the diagonal terms of  var\left(\hat\beta\right).
    • Note: when calculating the standard error of coefficients, don’t forget the intercept \beta_0 and add constant column to X.
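The steps above can be sketched in pure Python for the simplest case, simple linear regression with an intercept (so X^TX is 2×2 and easy to invert by hand); the data below are made up for illustration:

```python
import math

# toy data (made up): roughly y = 2 + 3x plus small deviations
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.1, 4.9, 8.2, 10.8, 14.1]
n = len(x)

# X has a constant column (the intercept!) and the feature column,
# so X^T X = [[n, sum(x)], [sum(x), sum(x^2)]]
sx, sxx = sum(x), sum(v * v for v in x)
det = n * sxx - sx * sx
inv = [[sxx / det, -sx / det], [-sx / det, n / det]]  # (X^T X)^{-1}

# beta_hat = (X^T X)^{-1} X^T y
sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
b0 = inv[0][0] * sy + inv[0][1] * sxy
b1 = inv[1][0] * sy + inv[1][1] * sxy

# sigma_hat^2 = sum of squared residuals / N, as in the note
# (many texts divide by N - p instead, for an unbiased estimate)
rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
sigma2 = rss / n

# standard errors: sqrt of the diagonal of sigma_hat^2 (X^T X)^{-1}
se_b0 = math.sqrt(sigma2 * inv[0][0])
se_b1 = math.sqrt(sigma2 * inv[1][1])
print(b0, b1, se_b0, se_b1)
```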
  5. Standard error of MLE (Fisher Information)
    • \sqrt{n}\left(\hat{\theta}^{ML}_{n} - \theta\right) \sim \mathcal{N}\left(0, \mathcal{I}^{-1}\left(\theta\right)\right)
    • var\left(\hat{\theta}^{ML}_{n}\right)  = \frac{1}{n}\mathcal{I}^{-1}\left(\theta\right) = \frac{1}{n}\left(E\left[-\frac{\partial^{2}\mathcal{L}\left(\theta\right)}{\partial\theta\partial\theta'}\right]\right)^{-1}
    • The derivation uses the delta method.
    • Relation to the central limit theorem (CLT): when the MLE happens to be the sample-average estimator, this asymptotic result reduces to the CLT.
  6. More on Fisher Information \mathcal{I}\left(\theta\right)
    • Fisher information is the variance of the score, i.e., of the first derivative of the log-likelihood with respect to \theta:
      l\left(\theta\right) = \log L\left(X, \theta\right)
      \mathcal{I}\left(\theta\right) = var\left[l'\left(\theta\right)\right] = -E\left[l''\left(\theta\right)\right]
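As a concrete check of the identity above, here is a small Bernoulli(p) example (my own illustration, not from the notes), where both expressions equal 1/(p(1-p)):

```python
# Bernoulli(p): l(p)  =  x*log(p) + (1 - x)*log(1 - p)
# score:        l'(p) =  x/p - (1 - x)/(1 - p)
# curvature:   l''(p) = -x/p**2 - (1 - x)/(1 - p)**2
p = 0.3

def E(f):
    # expectation over x in {0, 1} with P(x = 1) = p
    return p * f(1) + (1 - p) * f(0)

score = lambda x: x / p - (1 - x) / (1 - p)
neg_curvature = lambda x: x / p ** 2 + (1 - x) / (1 - p) ** 2

var_score = E(lambda x: score(x) ** 2) - E(score) ** 2
print(var_score, E(neg_curvature), 1 / (p * (1 - p)))  # all three agree
```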
  7. Method of Moment Estimator (MIT 18.650)
    • \sqrt{n}\left(\hat{\theta}^{MM}_{n} - \theta\right) \sim \mathcal{N}\left(0, \Gamma\left(\theta\right)\right)
    • \Gamma\left(\theta\right) = \left(\nabla{\psi^{-1}_{M\left(\theta\right)}}\right)^T\Sigma\left(\theta\right)\nabla{\psi^{-1}_{M\left(\theta\right)}}
    • \Sigma\left(\theta\right) = Var\left(X, X^2, \cdots, X^d\right) is the covariance matrix of the random vector \left(X, X^2, \cdots, X^d\right)
    • M\left(\theta\right) \equiv \left(m_1\left(\theta\right), \cdots, m_d\left(\theta\right)\right)^T = \psi\left(\theta\right)
  8. Multivariate CLT and Delta Method
    • \sqrt{n}\left(\hat\theta - \theta\right) \sim \mathcal{N}\left(0, \Sigma\right)
    • \sqrt{n}\left(g\left(\hat\theta\right) - g\left(\theta\right)\right) \sim \mathcal{N}\left(0, \nabla g\left(\theta\right)^T\Sigma\nabla g\left(\theta\right)\right)
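A quick Monte Carlo sanity check of the delta method (my own toy setup, not from the notes: X ~ N(mu, sigma^2) and g(x) = x^2, so the predicted asymptotic variance is g'(mu)^2 sigma^2 = 4 mu^2 sigma^2):

```python
import random
import statistics

random.seed(0)
mu, sigma, n = 2.0, 1.0, 100

# simulate many sample means, apply g(x) = x**2, and compare the
# spread of sqrt(n) * (g(xbar) - g(mu)) with the delta-method prediction
draws = []
for _ in range(5000):
    xbar = statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
    draws.append(n ** 0.5 * (xbar ** 2 - mu ** 2))

predicted_var = (2 * mu) ** 2 * sigma ** 2  # g'(mu)^2 * sigma^2
print(statistics.variance(draws), predicted_var)  # should be close
```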
  9. Important tests
    • Wald chi-square test: tests whether the parameters estimated by MLE equal their null-hypothesis values (uses Fisher information). For multinomial proportions:
      n\sum^{k}_{i=1}\frac{\left(\hat{p_i} - p_i^0\right)^2}{p_i^0} \sim \chi^2_{k-1}
    • Likelihood ratio test: similar to a Bayesian hypothesis test, assuming a cost of 1 for the two types of error and prior ratio P(H1)/P(H0) = P(H1)/(1 – P(H1)) = c, where c is a variable that controls the trade-off between the type I and type II errors.
      The likelihood ratio test can also be used to test a hypothesis on a subset of the parameters.

      • The Neyman–Pearson theorem states that the likelihood ratio test has the smallest type II error among all tests satisfying a given constraint (maximal allowed value) on the type I error.
    • Implicit hypotheses test: test g\left(\theta\right) = 0.

    • Student t test: used when the sample comes from a normal distribution \mathcal{N}\left(\mu_0, \sigma^2\right) but \sigma is unknown.
      \tilde{T_n} = \sqrt{n-1}\frac{\bar{X_n} - \mu_0}{\sqrt{S_n}} \sim t_{n-1}
      The t test is especially useful for small samples (n < 10). When n is large (> 10), the t_{n-1} distribution is very close to the standard normal distribution.

      • \frac{nS_n}{\sigma^2}\sim \chi^2_{n-1}
      • Note that the Student t test assumes the data are Gaussian.
    • Two-sample test: compares a parameter (e.g., the mean) across two independent samples.
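For the Student t statistic above: with S_n the biased (divide-by-n) sample variance, \sqrt{n-1}(\bar{X}_n - \mu_0)/\sqrt{S_n} is algebraically identical to the more common form \sqrt{n}(\bar{X}_n - \mu_0)/s with the unbiased s. A quick numeric check on made-up data:

```python
import math
import statistics

x = [5.1, 4.8, 5.4, 5.0, 4.7, 5.3]  # toy sample (made up)
mu0 = 5.0
n = len(x)
xbar = statistics.fmean(x)

# biased sample variance S_n (divide by n), as in the note
S_n = sum((v - xbar) ** 2 for v in x) / n
t_note = math.sqrt(n - 1) * (xbar - mu0) / math.sqrt(S_n)

# textbook form with the unbiased variance (divide by n - 1)
s2 = statistics.variance(x)  # statistics.variance divides by n - 1
t_std = math.sqrt(n) * (xbar - mu0) / math.sqrt(s2)

print(t_note, t_std)  # the two forms coincide
```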
  10. Test for goodness of fit: test of hypothesis on distributions
  11. Classification criterion: Gini impurity or information gain (entropy reduction)
    • Gini: Gini = 1 - \sum_{i=1}^c p_i^2
    • Entropy: Entropy = -\sum_{i=1}^c p_i \log p_i
    • Both Gini and entropy reach their maxima at the uniform distribution.
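A small sketch (my own example) computing both criteria; both come out larger for the uniform distribution than for a skewed one:

```python
import math

def gini(p):
    # Gini impurity: 1 - sum of squared class probabilities
    return 1 - sum(q * q for q in p)

def entropy(p):
    # Shannon entropy: -sum p_i * log(p_i), skipping empty classes
    return -sum(q * math.log(q) for q in p if q > 0)

uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]

print(gini(uniform), gini(skewed))        # uniform gives the larger Gini
print(entropy(uniform), entropy(skewed))  # and the larger entropy
```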
  12. Regressor criterion: MSE or MAE.
  13. Time series tests:
    • Durbin–Watson statistic: tests for autocorrelation in the residuals
      d = \frac{\sum^T_{t=2}\left(e_t - e_{t-1}\right)^2}{\sum^T_{t=1}e_t^2}
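The statistic can be computed directly from a model's residuals; a minimal sketch with made-up residuals that trend together (so d comes out well below 2, signaling positive autocorrelation):

```python
# made-up residuals with positive autocorrelation
e = [0.5, 0.4, 0.6, 0.3, -0.2, -0.4, -0.1, 0.2]

num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
den = sum(v * v for v in e)
d = num / den

# d always lies in [0, 4]; d near 2 means little autocorrelation,
# d << 2 positive autocorrelation, d >> 2 negative autocorrelation
print(d)
```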


Posted in Study Notes

Python Note

  • 0.02 multiplier (0.02 is not exactly representable in binary floating point)
    • 0.02 * 35 = 0.7000000000000001
    • 2*35/100 = 0.7
  • round: Python 3 uses banker’s rounding (round half to even)
    • round(2.5) = 2
    • round(2.5 + 1e-10) = 3
    • round(1.5) = 2
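If exact decimal behavior matters, the standard library’s decimal module sidesteps both surprises (the binary representation error and banker’s rounding); a minimal sketch:

```python
from decimal import Decimal, ROUND_HALF_UP

# exact decimal arithmetic: no binary representation error
print(Decimal("0.02") * 35)  # 0.70, not 0.7000000000000001

# conventional round-half-up instead of Python 3's banker's rounding
print(Decimal("2.5").quantize(Decimal("1"), rounding=ROUND_HALF_UP))  # 3
```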
Posted in programming

Master Referral Thread

Updated on: 08/17/17


  • Join Robinhood and we’ll both get a share of stock like Apple, Ford, or Sprint for free. Make sure you use my link.
    Referral Link:


Posted in Uncategorized

INT_MAX and INT_MIN in Python

  • In Python 2.x, one can use sys.maxint.
  • In Python 3.x, integers are unbounded; the closest analogue is sys.maxsize.
  • Use float: float('inf') and float('-inf')
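A short sketch of the Python 3 options (sys.maxsize is the largest container index, not a true INT_MAX, since Python 3 ints are unbounded):

```python
import sys

# sys.maxsize: largest value a container index can take
# (2**63 - 1 on typical 64-bit builds); often used as a sentinel
print(sys.maxsize)

# float infinities compare beyond every finite number,
# including arbitrarily large Python ints
print(float('inf') > 10 ** 100)      # True
print(float('-inf') < -(10 ** 100))  # True
```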
Posted in Uncategorized

Add All Amex Offers in One Second

Update: multiple tabs don’t work perfectly with this method; it may fail to load some offers. If you are using multiple tabs, manually add the offers you plan to use.

In Chrome, log into your account and load all offers. If you have multiple cards, open multiple tabs and load offers for each card in each tab.

Press Ctrl+Shift+C to open the developer tools and go to the console. Enter:

Posted in Uncategorized


Got back to using Windows…

Using PuTTY to ssh into a Unix server.

Tab completion and up/down-arrow history were not working. I thought the PuTTY configuration was incorrect. After an hour of tuning the config, still not working. Then I realized maybe the shell is not bash…

echo $0

Wtf, it’s ksh… an ancient shell.

uname gives AIX… No wonder, it’s an IBM machine.

Now the solution is clear. “chsh” lists the available shells. Luckily bash is available at “/usr/bin/bash”. Add “/usr/bin/bash” to the first line of ~/.profile.

Mission complete!
I am expecting more troubles using windows and AIX…

Posted in Uncategorized

run firefox on a remote Linux server

I have a Python script that opens Firefox. I can ssh to my Mac and run the script without issue. But when I ssh to the Raspberry Pi 2 (running Ubuntu MATE 15.04) and try to run the script, it shows a “No Display Specified” error. The solution is to use “export DISPLAY=:0”:

ssh name@server
export DISPLAY=:0

A Firefox window will open on the remote server’s display.

Posted in Linux