冷静的大熊猫 · 医疗服务“扩容”!深圳恒生医院6大科室开设夜间门诊· 2 月前 · |
烦恼的野马 · 宋祖德曝章子怡“A先生” ...· 3 月前 · |
性感的黑框眼镜 · 在意大利 不知道这四种建筑风格会被人笑话 ...· 5 月前 · |
奔跑的鸭蛋 · Identifying Lines and ...· 1 年前 · |
不拘小节的牛排 · 使用变量作为参数的CSS ...· 1 年前 · |
svm lof scikit-learn |
https://scikit-learn.org/dev/auto_examples/neighbors/plot_lof_outlier_detection.html |
俊秀的盒饭
3 周前 |
__sklearn_is_fitted__
as Developer API
set_output
API
Go to the end to download the full example code. or to run this example in your browser via JupyterLite or Binder
The Local Outlier Factor (LOF) algorithm is an unsupervised anomaly detection
method which computes the local density deviation of a given data point with
respect to its neighbors. It considers as outliers the samples that have a
substantially lower density than their neighbors. This example shows how to use
LOF for outlier detection which is the default use case of this estimator in
scikit-learn. Note that when LOF is used for outlier detection it has no
predict
,
decision_function
and
score_samples
methods. See the
User
Guide
for details on the difference between outlier
detection and novelty detection and how to use LOF for novelty detection.
The number of neighbors considered (parameter
n_neighbors
) is typically set 1)
greater than the minimum number of samples a cluster has to contain, so that
other samples can be local outliers relative to this cluster, and 2) smaller
than the maximum number of close by samples that can potentially be local
outliers. In practice, such information is generally not available, and taking
n_neighbors=20
appears to work well in general.
# Authors: The scikit-learn developers
# SPDX-License-Identifier: BSD-3-Clause
Generate data with outliers#
import numpy as np
np.random.seed(42)
X_inliers = 0.3 * np.random.randn(100, 2)
X_inliers = np.r_[X_inliers + 2, X_inliers - 2]
X_outliers = np.random.uniform(low=-4, high=4, size=(20, 2))
X = np.r_[X_inliers, X_outliers]
n_outliers = len(X_outliers)
ground_truth = np.ones(len(X), dtype=int)
ground_truth[-n_outliers:] = -1
Fit the model for outlier detection (default)#
Use fit_predict
to compute the predicted labels of the training samples
(when LOF is used for outlier detection, the estimator has no predict
,
decision_function
and score_samples
methods).
from sklearn.neighbors import LocalOutlierFactor
clf = LocalOutlierFactor(n_neighbors=20, contamination=0.1)
y_pred = clf.fit_predict(X)
n_errors = (y_pred != ground_truth).sum()
X_scores = clf.negative_outlier_factor_
Plot results#
import matplotlib.pyplot as plt
from matplotlib.legend_handler import HandlerPathCollection
def update_legend_marker_size(handle, orig):
"Customize size of the legend marker"
handle.update_from(orig)
handle.set_sizes([20])
plt.scatter(X[:, 0], X[:, 1], color="k", s=3.0, label="Data points")
# plot circles with radius proportional to the outlier scores
radius = (X_scores.max() - X_scores) / (X_scores.max() - X_scores.min())
scatter = plt.scatter(
X[:, 0],
X[:, 1],
s=1000 * radius,
edgecolors="r",
facecolors="none",
label="Outlier scores",
plt.axis("tight")
plt.xlim((-5, 5))
plt.ylim((-5, 5))
plt.xlabel("prediction errors: %d" % (n_errors))
plt.legend(
handler_map={scatter: HandlerPathCollection(update_func=update_legend_marker_size)}
plt.title("Local Outlier Factor (LOF)")
plt.show()
Total running time of the script: (0 minutes 0.088 seconds)
Novelty detection with Local Outlier Factor (LOF)
Novelty detection with Local Outlier Factor (LOF)
Comparing anomaly detection algorithms for outlier detection on toy datasets
Comparing anomaly detection algorithms for outlier detection on toy datasets
Evaluation of outlier detection estimators
Evaluation of outlier detection estimators
Outlier detection on a real data set
Outlier detection on a real data set
冷静的大熊猫 · 医疗服务“扩容”!深圳恒生医院6大科室开设夜间门诊 2 月前 |