Dzwebs.Net

Over ten years of writing essays on computer technology

How do you extract image URLs from an HTML page and download the images to your own website directory?

Admin | 2008-8-21 10:14:20 | Views: 10261


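  Anyone who runs an article-based website will, sooner or later, be tempted to copy or adapt someone else's article and pass it off as their own!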

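  This is not so much theft as a "take what is useful" approach. One person's views and knowledge are limited and cannot cover everything. Borrowing from others is not the same as stealing; more often it means building on their work and refining it further!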

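  So many webmasters have probably run into this: when "copying" from someone else's site, the article contains images. How do you save them automatically into a directory on your own site?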

  The following C# source code was copied from the web!

using System;
using System.Text;
using System.Text.RegularExpressions;
using System.IO;
using System.Net;   // WebClient
using System.Web;   // HttpContext

namespace zhang.Common
{
    public class HanlerFiles
    {
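        // Finds every img tag in the HTML and returns the image URL found in each one
        // (an empty string for tags that contain no absolute image URL).
        private string[] GetImgTag(string htmlStr)
        {
            // [^>]+ also matches tags that span several lines, which ".+?" does not.
            Regex regObj = new Regex("<img[^>]+>", RegexOptions.Compiled | RegexOptions.IgnoreCase);
            // Run the match once and reuse the result.
            MatchCollection matches = regObj.Matches(htmlStr);
            string[] strAry = new string[matches.Count];
            int i = 0;
            foreach (Match matchItem in matches)
            {
                strAry[i] = GetImgUrl(matchItem.Value);
                i++;
            }
            return strAry;
        }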


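        // Returns the first absolute image URL inside an img tag, or "" if there is none.
        private string GetImgUrl(string imgTagStr)
        {
            // The dot before the extension is escaped, and the URL may not contain
            // whitespace, quotes, or angle brackets, so the match stays inside one attribute value.
            Regex regObj = new Regex(@"https?://[^\s""'<>]+\.(?:jpe?g|gif|bmp|png)", RegexOptions.Compiled | RegexOptions.IgnoreCase);
            Match matchItem = regObj.Match(imgTagStr);
            return matchItem.Success ? matchItem.Value : "";
        }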

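        /// <summary>
        /// Finds the image URLs in an HTML string, downloads each image to the given
        /// server directory, and rewrites the HTML to point at the local copies.
        /// </summary>
        /// <param name="strHTML">HTML to scan; downloaded image URLs are replaced with local paths.</param>
        /// <param name="path">Virtual path of the target directory, passed to Server.MapPath.</param>
        /// <returns>The number of img tags found in the HTML.</returns>
        public int SaveUrlPics(ref string strHTML, string path)
        {
            string[] imgurlAry = GetImgTag(strHTML);
            // Culture-independent timestamp prefix, so the file name never contains "/" or ":".
            string preStr = DateTime.Now.ToString("yyyyMMddHHmmss", System.Globalization.CultureInfo.InvariantCulture) + "_";
            // Make sure the directory ends with exactly one "/" before the file name is appended.
            string dir = path.TrimEnd('/') + "/";
            using (WebClient wc = new WebClient())
            {
                for (int i = 0; i < imgurlAry.Length; i++)
                {
                    // Skip img tags that had no absolute image URL.
                    if (string.IsNullOrEmpty(imgurlAry[i])) continue;
                    string fileName = preStr + imgurlAry[i].Substring(imgurlAry[i].LastIndexOf("/") + 1);
                    try
                    {
                        wc.DownloadFile(imgurlAry[i], Path.Combine(HttpContext.Current.Server.MapPath(path), fileName));
                        strHTML = strHTML.Replace(imgurlAry[i], dir + fileName);
                    }
                    catch (WebException)
                    {
                        // Leave the original URL in place if this image cannot be downloaded.
                    }
                }
            }
            return imgurlAry.Length;
        }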

    }
}
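
The listing above picks up any http URL that appears inside an img tag. As a rough sketch of a variant, the extraction step could instead read the src attribute directly and build the local file name from the URL path. The names ImageUrlSketch, ExtractImageUrls and LocalFileName, the example.com URLs, and the timestamp prefix are all made up for illustration. The sketch downloads nothing and assumes quoted src values only.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the listing above.
public static class ImageUrlSketch
{
    // Captures the value of a quoted src attribute inside each img tag.
    private static readonly Regex ImgSrc = new Regex(
        @"<img\b[^>]*?\bsrc\s*=\s*[""']([^""']+)[""']",
        RegexOptions.IgnoreCase);

    // Returns the distinct absolute http(s) image URLs, in order of first appearance.
    public static List<string> ExtractImageUrls(string html)
    {
        var urls = new List<string>();
        foreach (Match m in ImgSrc.Matches(html))
        {
            string src = m.Groups[1].Value;
            Uri uri;
            bool isHttp = Uri.TryCreate(src, UriKind.Absolute, out uri)
                && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
            if (isHttp && !urls.Contains(src))
            {
                urls.Add(src);
            }
        }
        return urls;
    }

    // Builds a local file name from a prefix and the last path segment of the URL,
    // ignoring any query string.
    public static string LocalFileName(string url, string prefix)
    {
        return prefix + Path.GetFileName(new Uri(url).AbsolutePath);
    }

    public static void Main()
    {
        string html = "<p><img src=\"http://example.com/a/logo.png\" alt=\"\">"
                    + "<IMG alt='x' src='https://example.com/b/photo.jpg?w=200'>"
                    + "<img src=\"/local/skip.gif\"></p>";

        foreach (string url in ExtractImageUrls(html))
        {
            Console.WriteLine(url + " -> " + LocalFileName(url, "20080821101420_"));
        }
    }
}
```

Main prints each absolute URL next to the file name it would be saved under, and skips the relative /local/skip.gif. Reading src rather than scanning the whole tag avoids picking up URLs from other attributes, at the cost of missing unquoted attribute values.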


This essay is filed under: Web development essays
